Scipy optimize - different results when using built-in float and float128() - python

I have a complex function that involves very (very) large numbers, and I optimize it with scipy.minimize.
When I originally implemented the function I used numpy.float128() numbers, because I thought they could handle big numbers better.
However, I recently attended a course and learned that Python ints (and, I assumed, floats) can be arbitrarily large.
I changed my code to use plain integers (changing the initialization from a = np.float128() to a = 0), and surprisingly the very same function has a different optimum with a = 0 than with a = np.float128(). If I run the minimization with, say, a = np.float128() 10 times, I get the same result every time. I use the SLSQP method for optimization, with bounds.
The code is complex and I don't think it is required to answer my question, but I can provide it if needed.
So how can this happen? Which type should I use? Is this some kind of precision error?
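For context (an editorial note, not part of the original question): Python ints are arbitrary precision, but Python floats are fixed 64-bit doubles, and np.float128 is usually 80-bit extended precision (platform dependent), so neither float type is arbitrary precision. A minimal sketch, assuming NumPy is installed, showing the fixed precision of both types:

import numpy as np

# Python floats are IEEE-754 doubles; np.float128 is typically 80-bit
# extended precision padded to 128 bits on x86, so both have a fixed,
# finite precision.
print(np.finfo(np.float64).eps)    # about 2.2e-16
print(np.finfo(np.float128).eps)   # about 1.1e-19 on x86

# The extra guard digits change intermediate rounding, which can be
# enough to steer SLSQP to a different local optimum on an
# ill-conditioned objective.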

Related

Efficient way to compute the confluent Hypergeometric function for large arrays (~ 10^8 points) with complex parameters

I am working on a project related to gravitational lensing, for which I need to evaluate the confluent hypergeometric function 1F1(a,b,z) for an array z of length ~ 10^8 complex points, a = 1+0.48j and b = 1. I am looking for an efficient way to evaluate this on large array sizes. The scipy implementation is fast but does not accept complex arguments for a and b.
mpmath seems to be the best way to calculate 1F1 for complex parameters but mpmath.hyp1f1 does not accept array values. The best workaround I found for this was to use np.vectorize or np.frompyfunc to allow passing a NumPy array as a parameter. However, this is extremely slow and would take days to execute (even with gmpy2 installed). I assume this is because mpmath functions are always slow on large array sizes.
A non-Python implementation would be fine as well, as long as I can somehow save the result to disk and read it into my Python code. I have seen some implementations (for example https://www.math.ucla.edu/~mason/research/pearson_final.pdf) which could possibly work, but I'm not sure.
Another possible way would be to interpolate the function (consecutive points in my input array are extremely close), but I'm not sure what would be the best way to do that.
Thanks!
I was having a very similar problem to yours.
I figured out that the mpmath package has a "hidden" set of functions with (only) float precision, which you can access by prefixing fp. This does not exist for hyp1f1, but it does for the more general hyper: there is an fp.hyper in the mpmath package, and fp.hyper([a], [b], z) is equivalent to hyp1f1(a, b, z) but a lot faster.
If you vectorize this with np.vectorize, your calculation should become substantially faster.
Disclaimer: I got an error message saying that some complex value was converted to real by dropping the imaginary part when evaluating this, but so far the results I have gotten seem sensible and compatible with the hyp1f1(a, b, z) values.
Added: It seems that fp.hyper does not like getting NumPy data types, even when they are scalars; if a, b, or z is a NumPy scalar (for example one element of a NumPy array), it simply returns 1 without an error message, regardless of the actual input. If you use np.vectorize, however, everything should be fine.
Either way: use at your own risk.
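A minimal sketch of that workaround, assuming mpmath and NumPy are installed (the sample points z are placeholders; the parameter values are taken from the question):

import numpy as np
from mpmath import fp, hyp1f1

a, b = 1 + 0.48j, 1.0
z = np.linspace(0, 1, 1000) * (0.3 + 0.7j)   # placeholder sample points

# fp.hyper works in plain double precision and is much faster than the
# arbitrary-precision hyp1f1; pass plain Python complex numbers, since
# fp.hyper can silently misbehave on NumPy scalar types, and declare a
# complex output type so np.vectorize does not drop imaginary parts.
hyp1f1_fast = np.vectorize(lambda zz: fp.hyper([a], [b], complex(zz)),
                           otypes=[complex])

vals = hyp1f1_fast(z)

# Spot-check a point against the arbitrary-precision version.
print(vals[0], complex(hyp1f1(a, b, complex(z[0]))))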

What is the Precision of SymPy's 'expand' Function?

I am trying to expand a function of the form (X + Y + Z) ^ N where N is sufficiently large so that the expanded product will contain terms with coefficients much greater than 2 ^ 64; for the sake of this discussion let's just say that N is greater than 200. This is an issue because I am hoping to do an analysis of the expanded form of this function, and this analysis requires exact precision for all of the terms and their coefficients.
To expand the function I am using the Python module SymPy, which has seemed very promising thus far and been able to expand functions where N is > 150 in a relatively short amount of time. My concern though is that after looking through some of the expanded functions, I am seeing coefficients with more trailing zeroes than I might expect. I know that I can run everything through mpmath for my analysis after the function has been expanded, but as of now, I am unsure as to whether or not some of the larger coefficients are even exactly correct in the first place.
Under the documentation for SymPy's expand function, there is no clarification of how precise the coefficients of the expansion are when working with very large numbers. I know for a fact that SymPy uses the mpmath module for some of its functions, so I know that it is capable of arbitrary precision, I just don't know if arbitrary precision explicitly applies to this case.
I know that I could also confirm whether the expand function is arbitrarily precise by summing all of the coefficients of a given expansion and checking whether that sum is equal to 3 ^ N, but I'd rather not spend a few hours coding all the necessary pieces to make that assessment, only to find out that expand is imprecise.
If anyone has any suggestions for easier ways to confirm the precision of expand, then I would appreciate that if direct confirmation of its precision cannot be given.
Although PR 18960 has not yet been merged, you can confirm there that the coefficients are correct:
>>> from sympy.abc import x, y, z
>>> multinomial(15, 16, 14)  # multinomial() is added in PR 18960
50151543548788717200
>>> ((x + y + z)**(15 + 16 + 14)).expand().coeff(x**15*y**16*z**14)
50151543548788717200
>>> _ > 2**64
True
Since Python supports unlimited integers and the coefficients are integers, I don't know of any reason why they would not be exact.
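As a further sanity check (an editorial addition, not part of the original answer): substituting x = y = z = 1 into the expansion sums all of its coefficients, and for (x + y + z) ^ N that sum must equal exactly 3 ^ N. A quick sketch:

from sympy.abc import x, y, z

N = 200
expanded = ((x + y + z)**N).expand()

# Substituting 1 for every symbol sums all coefficients of the
# expansion; for (x + y + z)**N this sum is exactly 3**N, an integer
# far larger than 2**64.
assert expanded.subs({x: 1, y: 1, z: 1}) == 3**N
print("coefficient sum equals 3**%d exactly" % N)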

Automatically generate data for unit testing in Python

I have a module to test; the module includes a series of functions / simple classes.
I am wondering whether there are any attempts (i.e. a package) to automatically:
1) Generate Python test code from an initial Python file containing the function definitions.
2) Have this code list calls to the functions with random/parametric data as parameters.
It seems technically feasible using inspect and Python metaclasses, though probably limited to numerically typed functions (NumPy arrays), because strings (e.g. URL inputs) would be impossible to generate meaningfully (only parametrically).
EDIT: By random, I obviously mean "parametric random".
Suppose we have
def f(x1, x2, x3)
For every xi of f:
if type(xi) == 1-D array ->
run these tests: empty array, zero array, random negative array,
random positive array, high values, low values, integer array, real
number array, ordered array, equally spaced array, ...
if type(xi) == int -> test zero, 1, 2, 3, 4, random values, negatives
Do people think such a project is possible using inspect and metaclasses (limited to NumPy/numerical items)?
Suppose you have a very large library; this kind of thing could be done in the background.
You might be thinking of fuzz testing, where a bunch of garbage data is submitted to a function to see if anything makes it behave badly. It sounds like the Hypothesis library will let you generate different test cases based on some parameters.
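A minimal sketch of what that looks like with Hypothesis, assuming NumPy and Hypothesis are installed (the function under test, clip_positive, is just a made-up example):

import numpy as np
from hypothesis import given, strategies as st
from hypothesis.extra.numpy import arrays

def clip_positive(a):
    """Toy function under test (hypothetical example)."""
    return np.clip(a, 0, None)

# Hypothesis generates many 1-D float arrays, including edge cases
# such as empty arrays, zeros, and very large or very small values.
@given(arrays(dtype=np.float64,
              shape=st.integers(min_value=0, max_value=100),
              elements=st.floats(allow_nan=False, allow_infinity=False)))
def test_clip_positive_is_nonnegative(a):
    assert (clip_positive(a) >= 0).all()

test_clip_positive_is_nonnegative()   # runs the generated cases (or use pytest)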
I spent some time searching, and it seems this kind of project does not really exist (to my knowledge).
Technically, it would be a mix of packages (and open issues):
Hypothesis: data generation for the inputs, running the code and checking for crashes/errors (without the invariant part of Hypothesis).
Jedi: static analysis of the code / inference of the types.
Type inference is a difficult problem in Python in general (see implementing type inference):
If the type is a number or an array of numbers, boundaries exist and typical usage is clearly defined.
If the type is a string, inference is pretty difficult without human guessing.
The same goes for other types; guessing the context is important.
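A rough sketch of the inspect-based idea described above, assuming the functions under test carry type annotations (generate_inputs and the GENERATORS table are made-up names, not an existing package):

import inspect
from itertools import product
import numpy as np

# Hypothetical dispatch table: one list of candidate inputs per
# annotated parameter type.
GENERATORS = {
    int: lambda: [0, 1, 2, -1, int(np.random.randint(-1000, 1000))],
    np.ndarray: lambda: [np.array([]), np.zeros(10),
                         -np.random.rand(10), np.random.rand(10) * 1e6],
}

def generate_inputs(func):
    """Yield tuples of test arguments for func based on its annotations."""
    pools = []
    for param in inspect.signature(func).parameters.values():
        gen = GENERATORS.get(param.annotation)
        if gen is None:
            raise TypeError("no generator for %r" % (param.annotation,))
        pools.append(gen())
    # Cartesian product of the per-parameter value pools.
    yield from product(*pools)

def f(x1: np.ndarray, x2: int):
    return x1 * x2

for args in generate_inputs(f):
    f(*args)   # just check that nothing crashes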

How could I enhance the speed of this gamma function and complex numbers-related code?

Context: I am trying to make a very simple gamma-function fractal plot using Python and SymPy, initially in a very simple version to understand how it works (two colors mapped according to whether counter is 0 or 1).
Basically the code (below) calls the gamma function and then makes some complex-number comparisons: it just checks whether the complex number nextcomplex = gamma(mycomplex) is closer to 1+0i than the initial complex number mycomplex. The final algorithm for the fractal is more elaborate than that, but the basic calculations look like these, so I need to speed up this simple code.
For small intervals it works fine and I can plot the values, but it is very slow for big intervals; it has now been running for more than an hour for a total of test_limitn x test_limitm = 1000x1000 elements.
(For instance, up to 100x100 it runs fine and I can plot the values and see a very basic fractal.)
My question is: how could I change the code to make it faster? (E.g. other Python libraries, or other functions much better suited to these comparisons, etc.)
from sympy import gamma, I, re, im, zoo

test_limitn = 1000
test_limitm = 1000
for m in range(-test_limitm, test_limitm):
    for n in range(-test_limitn, test_limitn):
        counter = 0
        mycomplex = m + (n*I)
        nextcomplex = gamma(mycomplex).evalf(1)
        if mycomplex != zoo and nextcomplex != zoo:
            absrenextcomplex = re(nextcomplex)
            absimnextcomplex = abs(im(nextcomplex))
            if (abs(n) > absimnextcomplex) and (abs(1-m) > abs(1-absrenextcomplex)):
                counter = 1
Any hint is very welcome, thank you!
If you are only doing things numerically, you will be much better off using a numerical library like NumPy. SymPy is designed for symbolic calculations, and although it can perform numeric calculations, it isn't very fast at it.
Beyond that, numba may be able to improve the performance of your loops.
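A vectorized sketch of the same check with NumPy and SciPy, assuming both are installed (scipy.special.gamma accepts complex arrays; the poles of the gamma function are handled by masking non-finite values instead of the zoo comparison):

import numpy as np
from scipy.special import gamma

test_limit = 1000
m, n = np.meshgrid(np.arange(-test_limit, test_limit),
                   np.arange(-test_limit, test_limit))

mycomplex = m + 1j * n
nextcomplex = gamma(mycomplex)        # evaluated over the whole grid at once

# gamma has poles at the non-positive integers; mask non-finite results
# instead of comparing against sympy's zoo.
finite = np.isfinite(nextcomplex)

closer = (np.abs(n) > np.abs(nextcomplex.imag)) & \
         (np.abs(1 - m) > np.abs(1 - nextcomplex.real))

counter = np.zeros(mycomplex.shape, dtype=int)
counter[finite & closer] = 1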

Is there a library for programmatic manipulation of Big-O complexities?

I'm interested in programming languages that can reason about their own time complexity. To this end, it would be quite useful to have some way of representing time complexity programmatically, which would allow me to do things like:
f_time = O(n)
g_time = O(n^2)
h_time = O(sqrt(n))
fastest_asymptotically = min(f_time, g_time, h_time) # = h_time
total_time = f_time.inside(g_time).followed_by(h_time) # = O(n^3)
I'm using Python at the moment, but I'm not particularly tied to a language. I've experimented with sympy, but I haven't been able to find what I need out of the box there.
Is there a library that provides this capability? If not, is there a simple way to do the above with a symbolic math library?
EDIT: I wrote a simple library following Patrick87's advice and it seems to work. I'm still interested in whether there are other solutions for this problem, though.
SymPy currently only supports the expansion at 0 (you can simulate other finite points by performing a shift). It doesn't support the expansion at infinity, which is what is used in algorithmic analysis.
But it would be a good base package for it, and if you implement it, we would gladly accept a patch (nb: I am a SymPy core developer).
Be aware that in general the problem is tricky, especially if you have two variables, or even symbolic constants. It's also tricky if you want to support oscillatory functions. EDIT: If you are interested in oscillating functions, this SymPy mailing list discussion gives some interesting papers.
EDIT 2: And I would recommend against trying to build this yourself from scratch without using a computer algebra system. You will end up having to write your own computer algebra system, which is a lot of work, and even more work if you want to do it right and not have it be slow. There are already plenty of systems in existence, including many that can act as libraries for code to be built on top of them (such as SymPy).
Actually, what you are building (or looking for) is an expression simplifier that can deal with:
+ (in your terms: followed_by)
* (in your terms: inside)
^, log, ! (to represent the complexities)
variables (like n, m)
constant numbers (like the 2 in 2^n)
For example, the f_time.inside(g_time).followed_by(h_time) you gave could be an expression like:
n*(n^2) + (n^(1/2))
and you are expecting a processor to output n^3.
So generally speaking, you would want to use a common expression simplifier (if you want it to be interesting, go and check how Mathematica does it) to get a simplified expression like n^3 + n^(1/2), and then you need an additional processor to choose the term with the highest complexity from the expression and drop the other terms. That part is easy: just use a table to define the complexity order of each kind of symbol.
Please note that in this case the expressions are just symbols; you could write them as something like strings (for your example: f_time = "O(n)"), not as functions.
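A small sketch of that two-step idea, using SymPy as the expression simplifier (the choice of SymPy and the limit-based dominant-term selection are my own assumptions, not part of the answer above):

from sympy import symbols, sqrt, limit, oo

n = symbols('n', positive=True)

# f_time.inside(g_time).followed_by(h_time) written as a plain expression;
# SymPy already collapses n*n**2 to n**3.
total = n * n**2 + sqrt(n)

def dominant_term(expr):
    """Keep only the asymptotically dominant term of a sum."""
    terms = expr.as_ordered_terms()
    best = terms[0]
    for t in terms[1:]:
        # Switch to t if best/t stays bounded as n -> oo,
        # i.e. t grows at least as fast as the current best.
        if limit(best / t, n, oo) != oo:
            best = t
    return best

print(dominant_term(total))   # n**3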
If you're only working with big-O notation and are interested in whether one function grows more or less quickly than another, asymptotically speaking...
Given functions f and g:
Compute the limit of f(n)/g(n) as n goes to infinity, using a computer algebra package.
If the limit diverges to +infinity, then f > g - in the sense that g = O(f), but f != O(g).
If the limit converges to 0, then f < g - in the sense that f = O(g), but g != O(f).
If the limit converges to a finite nonzero number, then f = g (in the sense that f = O(g) and g = O(f)).
If the limit is undefined... beats me!
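A sketch of that recipe with SymPy's limit, assuming SymPy is installed (the compare_growth helper and its messages are mine, not from the answer):

from sympy import symbols, sqrt, limit, oo

n = symbols('n', positive=True)

def compare_growth(f, g):
    """Classify the asymptotic relation of f and g via lim f/g as n -> oo."""
    L = limit(f / g, n, oo)
    if L == oo:
        return "f grows faster: g = O(f) but f != O(g)"
    if L == 0:
        return "f grows slower: f = O(g) but g != O(f)"
    return "same order: f = O(g) and g = O(f)"

print(compare_growth(n, n**2))        # f grows slower
print(compare_growth(sqrt(n), n))     # f grows slower
print(compare_growth(3*n**2, n**2))   # same order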
