Creating an s-curve based on data points

Creating an s-curve based on data points - python

I have a series of data points which form a curve I do not have an equation for, and for which I have not been able to satisfactorily calculate an equation with either LibreOffice or the online curve-fitting tools in the first 2 pages of Google results.
I would like the equation for the curve and ideally a Python implementation of calculating y values for a given x value along that curve, in case there are unexpected hoops to jump through. Failing that, I would like any Python solution more elegant than a list of elif statements incrementing y if x is high enough for it to increase by a whole number, which is the ugly solution of last resort - my immediate plans do not require decimal precision.
The curve crosses the zero line at x = 10, and every whole-number increment of y requires x to be incremented by one more whole number than the previous, so y = 1 is reached at x = 11, y = 2 at x = 13, y = 3 at x = 16, etc., with the curve bending in the other direction in the negatives such that y = -1 is at x = 9, y = -2 is at x = 7, etc. I suspect I am missing something obvious as far as finding the curve equation when I already have this knowledge.
In addition to trying LibreOffice Calc and several online curve-fitting websites to no avail, I have tried slicing the s-curve into two logarithmic curves (I have given up on searching the term sigmoid function, as all my results are either related to neural nets or expect my y values to never exceed +-1). This almost works: 5 * (np.log(x) - 11) gets something frustratingly close to the top half of the curve, but I ultimately haven't been able to use it. In addition to crossing the number line at 9, it produced some odd behaviour when I returned round()-rounded y values directly, displaying results in the negative 40s when returned directly, but seeming to work fine when those numbers are fed into other calculations.
If somebody can give me two working logarithms that round to the right numbers for x values between 0 and 50, that is good enough for this project.
Thank you for your time and patience.
-EDIT-
These are triangular numbers, apparently: x-10 is equal to the number of dots in a triangle with y dots on each side. What I need is the inverse of the triangular number formula. Thank you to everyone who commented.

As mentioned in my edit, the y I am trying to find is the triangular root of x. This solution:
import numpy as np

def get_triangle_root(x: int) -> int:
    # Shift so that the curve crosses zero at x = 10.
    current_value = x - 10
    negative = False
    if current_value < 0:
        current_value = current_value * -1
        negative = True
    # Inverse of the triangular number formula t = y*(y+1)/2.
    current_value = np.sqrt(1 + (current_value * 8))
    current_value = (current_value - 1)/2
    if negative == True:
        current_value = current_value * -1
    # int() truncates toward zero.
    current_value = int(current_value)
    return current_value
seems to work fine for now. Curiously, when I calculate (-1+(sqrt(1+(8*x)))/2) using LibreOffice or Google, rather than getting the same results this Python script gives me, I get results 0.5 lower than the actual triangle root. Unimportant at this time, but I am curious as to what would cause it.
At any rate, thank you to everyone who lent their time to me. I apologise to anyone looking at this question who was looking for a universal solution for creating S-curves rather than just one that works for my specific task, but I feel it is best to attach an answer to this question so as not to impose on more people's time.
-EDIT- Changed the Python script to handle negative triangular numbers as well, something I had overlooked in my excitement.
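On the 0.5 discrepancy mentioned above, a likely cause is parenthesization. As typed, (-1+(sqrt(1+(8*x)))/2) divides only the square root by 2, whereas the Python function divides the whole difference by 2. The sketch below shows the gap; the two function names are illustrative and not part of the original script.

```python
import math

def triangle_root(t):
    # Inverse of t = y*(y+1)/2: the whole numerator is divided by 2.
    return (math.sqrt(1 + 8 * t) - 1) / 2

def expression_as_typed(t):
    # Only the square root is divided by 2, then 1 is subtracted.
    return -1 + math.sqrt(1 + 8 * t) / 2

# Compare both forms on the first few triangular numbers.
for t in (1, 3, 6, 10):
    print(t, triangle_root(t), expression_as_typed(t))
```

Since (s - 1)/2 = s/2 - 0.5 while the typed form is s/2 - 1, the second is always 0.5 lower, which matches the behaviour described.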

What you're looking for are a class of functions called "Sigmoid functions". They have a characteristic S-shape. Go to Wolfram and play around with some common Sigmoid funcs, remembering that the "a" in a function, f(x-a), shifts the entire curve left or right, and appending a value "b" to the function, f(x-a) + b will shift the curve up and down. Using a coefficient of "c", f(c*x - a) + b here acts as a scalar. That should get you where you want to be in short time.
Example: (1/(1 + C*exp(-(x + A)))) + B
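A minimal sketch of the example expression as a Python function may make the roles of the constants easier to see. The function name and the parameter values in the usage lines are arbitrary illustrations, not a fit to the asker's data.

```python
import math

def shifted_logistic(x, A=0.0, B=0.0, C=1.0):
    """Evaluate (1 / (1 + C*exp(-(x + A)))) + B."""
    return 1.0 / (1.0 + C * math.exp(-(x + A))) + B

# With A = -10 and C = 1 the midpoint of the curve sits at x = 10.
midpoint = shifted_logistic(10, A=-10)
# Adding B = -0.5 moves that midpoint down onto the zero line.
centered = shifted_logistic(10, A=-10, B=-0.5)
print(midpoint, centered)
```

Note that a logistic curve of this form levels off at both ends, so it cannot follow data that keeps growing without bound.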

Related

Binary-like search

It's a theoretical question.
It is based on an exercise from LeetCode.
My solution for the task is binary search, but the question is not about that.
I found a perfect solution on the Discuss tab.
(the following code was taken from there)
class Solution:
    def mySqrt(self, x: int) -> int:
        low, high = 1, x
        while low < high:
            high = (low + high) // 2
            low = x // high
        return high
It works perfectly. My question is:
For a regular binary search, we take the middle of the sequence and, depending on the comparison result, remove the excess part (left or right), then repeat until we have the result.
What is this implementation based on?
This solution cuts the part of the sequence right after the middle and a small part from the start.

This code isn't based on binary search. It's based instead on adapting the ancient "Babylonian method" to integer arithmetic. That in turn can be viewed as anticipating an instance of Newton's more-general method for finding a root of an equation.
Keeping distinct low and high variables isn't important in this code. For example, it's more commonly coded along these lines:
def intsqrt(n):
    guess = n  # must be >= true floor(sqrt(n))
    while True:
        newguess = (guess + (n // guess)) // 2
        if guess <= newguess:
            return guess
        guess = newguess
but with more care taken to find a better initial guess.
BTW, binary search increases the number of "good bits" by 1 per iteration. This method approximately doubles the number of "good bits" per iteration, so is much more efficient the closer the guess gets to the final result.
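To make the comparison concrete, here is a small sketch that counts iterations for both approaches. The function names and the step counters are additions for illustration; the Newton variant assumes n >= 1 and starts from the crude guess n, as in the code above.

```python
import math

def newton_isqrt(n):
    """Integer Babylonian/Newton iteration; returns (root, iterations). Assumes n >= 1."""
    guess, steps = n, 0
    while True:
        new_guess = (guess + n // guess) // 2
        steps += 1
        if guess <= new_guess:
            return guess, steps
        guess = new_guess

def bisect_isqrt(n):
    """Plain binary search for floor(sqrt(n)); returns (root, iterations)."""
    low, high, steps = 0, n, 0
    while low < high:
        mid = (low + high + 1) // 2
        steps += 1
        if mid * mid <= n:
            low = mid
        else:
            high = mid - 1
    return low, steps

n = 10**30
print(newton_isqrt(n), bisect_isqrt(n))
```

Starting from n, the Newton variant spends most of its iterations roughly halving the guess until it is near the root; only then does the fast convergence take over, which is why a better initial guess matters.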

This method is subtle, though it was known to the Babylonians (see Tim's answer).
Assume that h > √x. Then
1. l = x/h < √x and
2. (l+h)/2 > √x.
The first property is obvious. For the second, substituting l = x/h and multiplying by 2h gives
x+h² > 2h√x, or (h-√x)² > 0, which is true.
So h remains above √x, but it gets closer and closer (because (l+h)/2 < h). And when the computation is made with integers, there comes a moment when l ≥ h.
How was this method discovered?
Assume that you have an approximation h of √x and want to improve it with a correction δ. We write x = (h-δ)² = h²-2hδ + δ². If we neglect δ², then we obtain h-δ = (h²+x)/(2h) = (h+x/h)/2, which is our (h+l)/2.

How to minimize occuring errors in Verlet Integration for orbital mechanics

It's my first time asking a question here, and since I only started coding a few days ago, I may need your help defining my problem more precisely.
I would like to simulate some Keplerian orbits, but I have recurring problems with the required precision. For example, when my point comes too close to the center of gravity, it is slung out of every sensible orbit. (My theory is that the acceleration increases so drastically relative to the time step dt that the point reaches escape velocity before gravity 'has a chance' to pull it back.) Another problem is that the orbit itself rotates around the center of gravity. I have not studied this problem in detail, but I guess that here small errors add up, creating this flower-like effect.
First thought on why these errors occur:
As written in the title, I use a simple Verlet integration to approximate the true solution. I know there are other possibilities like the Runge-Kutta method, but I saw people doing some nice simulations with a simple Euler approximation, so I thought a second-order Verlet approximation should be sufficient. Maybe this is not the case, so should I use another method of approximation?
Second idea:
I simply coded it badly and there are better ways to handle it that keep numerical errors minimal. For further analysis I include a snippet of my small code here:
t = t + dt
i = 1
while t <= T and i <= n:
    # A_list was the idea of calculating the acceleration more efficient, i append the necessary entries, calculate the acceleration and delete the entries again to repeat the process
    A_list.append(x_list[i])
    A_list.append(y_list[i])
    r = np.sqrt(A_list[0]**2 + A_list[1]**2)
    A_x = -(G*M/r**2) * A_list[0] * 1/r
    A_y = -(G*M/r**2) * A_list[1] * 1/r
    x_i1 = 2*A_list[0] - x_list[i-1] + A_x * dt**2
    y_i1 = 2*A_list[1] - y_list[i-1] + A_y * dt**2
    del A_list[1]
    del A_list[0]
    i = i + 1
    t = t + dt
    x_list.append(x_i1)
    y_list.append(y_i1)
    t_list.append(t)
I hope this was somewhat readable and there is someone who would like to help a young programmer out. :)
Pictures for clarification: (two images, not reproduced here)
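For reference, a self-contained sketch of the same position-Verlet recurrence is given below. Two things are assumptions rather than part of the snippet above: the second starting point is built from a Taylor step using an initial velocity, and the units are chosen so that G*M = 1. The function name and the circular test orbit are likewise illustrative. With a circular orbit the radius should stay constant, so any drift can be measured directly, which may help separate integrator error from coding error.

```python
import math

def verlet_orbit(x0, y0, vx0, vy0, gm, dt, steps):
    """Position (Stormer) Verlet for a point mass around a fixed centre at the origin."""
    def accel(x, y):
        r = math.hypot(x, y)
        return -gm * x / r**3, -gm * y / r**3

    ax, ay = accel(x0, y0)
    # Second-order Taylor step supplies the second point the recurrence needs.
    xs = [x0, x0 + vx0 * dt + 0.5 * ax * dt**2]
    ys = [y0, y0 + vy0 * dt + 0.5 * ay * dt**2]
    for i in range(1, steps):
        ax, ay = accel(xs[i], ys[i])
        xs.append(2 * xs[i] - xs[i - 1] + ax * dt**2)
        ys.append(2 * ys[i] - ys[i - 1] + ay * dt**2)
    return xs, ys

# Circular test orbit in units where G*M = 1: radius 1, speed 1, period 2*pi.
xs, ys = verlet_orbit(1.0, 0.0, 0.0, 1.0, gm=1.0, dt=0.001, steps=6284)
max_radius_error = max(abs(math.hypot(x, y) - 1.0) for x, y in zip(xs, ys))
print(max_radius_error)
```

Repeating the run with a larger dt, or with an eccentric orbit that passes close to the centre, gives a quick way to see how the error grows with step size.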

Numerical solution of exponential equation using Python or other software

I want to find numerical solutions to the following exponential equation where a,b,c,d are constants and I want to solve for r, which is not equal to 1.
a^r + b^r = c^r + d^r (Equation 1)
I define a function in order to use Scipy.optimize.fsolve:
from scipy.optimize import fsolve

def func(r,a,b,c,d):
    if r==1:
        return 10**5
    else:
        return ( a**(1-r) + b**(1-r) ) - ( c**(1-r) + d**(1-r) )

fsolve(func,0.1, args=(5,5,4,7))
However, fsolve always returns 1 as the solution, which is not what I want. Can someone help me with this issue? Or, in general, tell me how to solve (Equation 1)? I used an online numerical solver a long time ago, but I cannot find it anymore. That's why I am trying to figure it out using Python.

You need to apply some mathematical reasoning when choosing the initial guess. Consider your problem: f(r) = (5^(1-r) + 5^(1-r)) − (4^(1-r) + 7^(1-r))
When r < 1, f(r) is always negative and keeps decreasing as r moves further left (since 7^(1-r) grows much faster than the other terms). Therefore, all root-finding algorithms will be pushed right, towards 1, until they reach this local solution.
You need to pick a point far away from 1 on the right to find the nontrivial solution:
>>> scipy.optimize.fsolve(lambda r: 5**(1-r)+5**(1-r)-4**(1-r)-7**(1-r), 2.0)
array([ 2.48866034])
Simply setting f(1) = 10^5 is not going to have any effect, as the root-finding algorithm won't check f(1) until the very last step (note).
If you wish to apply a penalty, the penalty must be applied to a range of values around 1. One way to do so, without affecting the position of other roots, is to divide the whole function by (r − 1):
>>> scipy.optimize.fsolve(lambda r: (5**(1-r)+5**(1-r)-4**(1-r)-7**(1-r)) / (r-1), 0.1)
array([ 2.48866034])
(note): they may climb like f(0.1) → f(0.4) → f(0.7) → f(0.86) → f(0.96) → f(0.997) → … and stop as soon as |f(x)| < 10^-5, so your f(1) is never evaluated
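The nontrivial root can also be cross-checked without SciPy. The sketch below is a plain bisection on the same f(r). The bracket [2, 3] is an assumption, chosen because f changes sign across it, and the function names are illustrative.

```python
def f(r, a=5.0, b=5.0, c=4.0, d=7.0):
    # Same function as in the question, written with float bases.
    return (a ** (1 - r) + b ** (1 - r)) - (c ** (1 - r) + d ** (1 - r))

def bisect_root(func, lo, hi, tol=1e-12):
    """Bisection on [lo, hi]; requires func(lo) and func(hi) to differ in sign."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid              # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # sign change is in the right half
    return (lo + hi) / 2

root = bisect_root(f, 2.0, 3.0)
print(root)
```

Because the bracket excludes r = 1, the trivial root cannot be returned, and the result should land close to the 2.4887 quoted above.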

First off, your code seems to use a different equation than your question: 1-r instead of just r.
The valid answers to the equation are 1 and approximately 2.4886, as can be seen here. With the second argument of fsolve you specify a starting estimate. I think you get that result because 0.1 is close to 1. Using 2.1 as the starting estimate, I get the other answer, 2.4886.
from scipy.optimize import fsolve

def func(r,a,b,c,d):
    if r==1:
        return 10**5
    else:
        return ( a**(1-r) + b**(1-r) ) - ( c**(1-r) + d**(1-r) )

print(fsolve(func, 2.1, args=(5,5,4,7)))
Choosing a starting estimate is tricky, as many give the following error: ValueError: Integers to negative integer powers are not allowed.

Solving recursive sequence

Lately I've been solving some challenges from Google Foobar for fun, and now I've been stuck in one of them for more than 4 days. It is about a recursive function defined as follows:
R(0) = 1
R(1) = 1
R(2) = 2
R(2n) = R(n) + R(n + 1) + n (for n > 1)
R(2n + 1) = R(n - 1) + R(n) + 1 (for n >= 1)
The challenge is writing a function answer(str_S) where str_S is a base-10 string representation of an integer S, which returns the largest n such that R(n) = S. If there is no such n, return "None". Also, S will be a positive integer no greater than 10^25.
I have investigated a lot about recursive functions and about solving recurrence relations, but with no luck. I outputted the first 500 numbers and I found no relation with each one whatsoever. I used the following code, which uses recursion, so it gets really slow when numbers start getting big.
def getNumberOfZombits(time):
    if time == 0 or time == 1:
        return 1
    elif time == 2:
        return 2
    else:
        if time % 2 == 0:
            newTime = time // 2
            return getNumberOfZombits(newTime) + getNumberOfZombits(newTime+1) + newTime
        else:
            newTime = time // 2  # integer division, so rounds down
            return getNumberOfZombits(newTime-1) + getNumberOfZombits(newTime) + 1
The challenge also included some test cases so, here they are:
Test cases
==========
Inputs:
(string) str_S = "7"
Output:
(string) "4"
Inputs:
(string) str_S = "100"
Output:
(string) "None"
I don't know if I need to solve the recurrence relation to anything simpler, but as there is one for even and one for odd numbers, I find it really hard to do (I haven't learned about it in school yet, so everything I know about this subject is from internet articles).
So, any help at all guiding me to finish this challenge will be welcome :)

Instead of trying to simplify this function mathematically, I simplified the algorithm in Python. As suggested by #LambdaFairy, I implemented memoization in the getNumberOfZombits(time) function. This optimization sped up the function a lot.
Then I moved on to the next step: finding which input produces a given number of rabbits. I had analyzed the function before by looking at its plot, and I knew the even numbers reached higher outputs first and only after some time did the odd numbers get to the same level. As we want the highest input for a given output, I first needed to search the even numbers and then the odd numbers.
As you can see, the odd numbers always take more time than the even ones to reach the same output.
The problem is that we could not search by increasing the number by 1 each time (it was too slow). What I did to solve that was to implement a binary-search-like algorithm. First, I would search the even numbers (with the binary-search-like algorithm) until I found an answer or had no more numbers to search. Then I did the same for the odd numbers (again with the binary-search-like algorithm), and if an answer was found, I replaced whatever I had before with it (as it was necessarily bigger than the previous answer).
I have the source code I used to solve this, so if anyone needs it I don't mind sharing it :)

The key to solving this puzzle was using a binary search.
As you can see from the sequence generators, they rely on a roughly n/2 recursion, so calculating R(N) takes about 2*log2(N) recursive calls; and of course you need to do it for both the odd and the even.
That's not too bad, but you need to figure out where to search for the N that produces the given value. To do this, I first implemented a search for upper and lower bounds for N. I walked N up by powers of 2 until I had an N and 2N that formed the lower and upper bounds, respectively, for each sequence (odd and even).
With these bounds, I could then do a binary search between them to quickly find the value of N, or its non-existence.
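The approach described in these answers can be sketched as follows. This is an illustrative reconstruction, not either poster's actual source: the helper name search_parity is hypothetical, and the code assumes, as both answers do, that R is increasing along the even indices and along the odd indices separately.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def R(n):
    # Memoized version of the recurrence from the question.
    if n < 2:
        return 1
    if n == 2:
        return 2
    half = n // 2
    if n % 2 == 0:
        return R(half) + R(half + 1) + half
    return R(half - 1) + R(half) + 1

def search_parity(target, parity):
    """Binary search over n = 2*k + parity (assumes R is increasing along that parity)."""
    lo, hi = 0, 1
    # Walk the upper bound up by powers of 2 until R reaches the target.
    while R(2 * hi + parity) < target:
        hi *= 2
    while lo <= hi:
        mid = (lo + hi) // 2
        value = R(2 * mid + parity)
        if value == target:
            return 2 * mid + parity
        if value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

def answer(str_S):
    target = int(str_S)
    hits = [n for n in (search_parity(target, 0), search_parity(target, 1)) if n is not None]
    return str(max(hits)) if hits else "None"

print(answer("7"), answer("100"))
```

Each evaluation of R only recurses on indices near n/2, so with memoization the search stays fast even for S close to 10^25.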

Python thinks Euler has identity issues (cmath returning funky results)

My code:
import math
import cmath
print "E^ln(-1)", cmath.exp(cmath.log(-1))
What it prints:
E^ln(-1) (-1+1.2246467991473532E-16j)
What it should print:
-1
(For Reference, Google checking my calculation)
According to the documentation at python.org cmath.exp(x) returns e^(x), and cmath.log(x) returns ln (x), so unless I'm missing a semicolon or something , this is a pretty straightforward three line program.
When I test cmath.log(-1) it returns πi (technically 3.141592653589793j). Which is right. Euler's identity says e^(πi) = -1, yet Python says when I raise e^(πi), I get some kind of crazy talk (specifically -1+1.2246467991473532E-16j).
Why does Python hate me, and how do I appease it?
Is there a library to include to make it do math right, or a sacrifice I have to offer to van Rossum? Is this some kind of floating point precision issue perhaps?
The big problem I'm having is that the precision is off enough to have other values appear closer to 0 than actual zero in the final function (not shown), so boolean tests are worthless (i.e. if(x==0)) and so are local minimums, etc...
For example, in an iteration below:
X = 2 Y= (-2-1.4708141202500006E-15j)
X = 3 Y= -2.449293598294706E-15j
X = 4 Y= -2.204364238465236E-15j
X = 5 Y= -2.204364238465236E-15j
X = 6 Y= (-2-6.123233995736765E-16j)
X = 7 Y= -2.449293598294706E-15j
3 & 7 are both actually equal to zero, yet they appear to have the largest imaginary parts of the bunch, and 4 and 5 don't have their real parts at all.
Sorry for the tone. Very frustrated.

As you've already demonstrated, cmath.log(-1) doesn't return exactly i*pi. Of course, returning pi exactly is impossible as pi is an irrational number...
Now you raise e to the power of something that isn't exactly i*pi and you expect to get exactly -1. However, if cmath returned that, you would be getting an incorrect result. (After all, exp(i*pi+epsilon) shouldn't equal -1 -- Euler doesn't make that claim!).
For what it's worth, the result is very close to what you expect -- the real part is -1 with an imaginary part close to floating point precision.

It appears to be a rounding issue. While -1+1.22460635382e-16j is not a correct value, 1.22460635382e-16j is pretty close to zero. I don't know how you could fix this but a quick and dirty way could be rounding the number to a certain number of digits after the dot ( 14 maybe ? ).
For values of magnitude around 1, anything smaller than about 10^-15 can normally be treated as zero. Computer calculations have a certain error that is often in that range. Floating point representations are representations, not exact values.

The problem is inherent to representing irrational numbers (like π) in finite space as floating points.
The best you can do is filter your result and set it to zero if its value is within a given range.
>>> import cmath
>>> tolerance = 1e-15
>>> def clean_complex(c):
...     real, imag = c.real, c.imag
...     if -tolerance < real < tolerance:
...         real = 0
...     if -tolerance < imag < tolerance:
...         imag = 0
...     return complex(real, imag)
...
>>> clean_complex( cmath.exp(cmath.log(-1)) )
(-1+0j)
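Where the goal is a boolean test rather than a cleaned-up value, the standard library's cmath.isclose (Python 3.5 and later) compares within a tolerance directly. The sketch below is a minimal illustration; the abs_tol value of 1e-12 is an assumed choice and should be picked to suit the magnitudes in the actual calculation.

```python
import cmath

z = cmath.exp(cmath.log(-1))

# Exact comparison fails because of the tiny imaginary residue.
exactly_minus_one = (z == -1)

# Tolerance-based comparison treats the residue as zero.
close_to_minus_one = cmath.isclose(z, -1, abs_tol=1e-12)

# When comparing against zero, abs_tol is required: a purely relative
# tolerance can never accept a nonzero value as close to 0.
residue_is_zero = cmath.isclose(z + 1, 0, abs_tol=1e-12)

print(exactly_minus_one, close_to_minus_one, residue_is_zero)
```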
