How to code this math integrate? - python

My math skills are really poor.
I'm trying to handle accelerometer data (tri-axis) with Python
I need to compute the norm
That is easy I made something like that:
math.sqrt(x_value**2 + y_value**2 + z_value**2)
but now, I have to compute the integration of that:
And for that I have no clue..
Can somebody help me on that?
Edit: Adding more info due to the negative votes (??)
I know there are tools to make integrations in python, but this one is not with edges (there are no limits in the formula) So that, I don't understand how to make it works..

Integrating the modulus of the acceleration will lead you nowhere. You must integrate all three components separately to get the components of the speed (and integrate them a second time to get the position vector).
To perform the numerical integration, use the Simpson rule incrementally. Anyway, as the number of points must be even, you will only get every other value of the speed.

Related

weighted central tendency in python

I have a list, like the one below:
[1,1,1,1,1,1,1,1,1,10]
I want to calculate the list average, however, arithmetic mean would not be an answer. I want my result to be skewed toward 10 because it is assumed to have much more weight. Geometric mean is not also an option. I wanted to know if there are other measures of central tendency that can be calculated in python using predefined functions. I am preparing my data for a neural network and I shall use various weights to see which one performs better. That is why I am looking for predefined functions so I can change their parameters to create multiple databases for my Neural Network.
Thanks in advance
P.s. The priority is to find the measure. Applying it in python comes next.
From my deep learning practice I don't quite agree that a higher weight should be desired.
Yet regarding your particular question, you can solve it
with straightforward power. High values become much larger, lower values much lower. After computing mean of powered values, compute a respective root, to return to the old scale.
And make sure you use an odd value for power, so that negative numbers retain sign.
import numpy as np
x = np.array([1,1,1,1,1,1,1,1,1,10])
power = 15
powered_mean = np.mean(np.power(x, power))
central_tendency = np.power(powered_mean, 1/power) # root
8.576958985908947

Plotting solution of complex equation with parameters with Python

This sounds to be something well explored, so I just need a reference.
I need to take a look (i.e. sketch the plot) to the solution of certain complex equations with several complex parameters and I would like to do that using Python/NumPy.
I have a certain number of complex functions of one complex variable depending on two parameters, and I need to add and compose them and finally to study the set of complex numbers such that the modulus of the above sum and composition is equal to a fixed positive real number.
For short: I have to plot the graph of the solutions of a certain complex equation, and see how does it move when parameters are touched.
Can somebody give a reference, please?
Thanks

How can I use implicit components to assemble a full system?

I'm working on a panel method code at the moment. To keep us from being bogged down in the minutia, I won't show the code - this is a question about overall program structure.
Currently, I solve my system by:
Generating the corresponding rows of the A matrix and b vector in an explicit component for each boundary condition
Assembling the partial outputs into the full A, b.
Solving the linear system, Ax=b, using a LinearSystemComp.
Here's a (crude) diagram:
I would prefer to be able to do this by just writing one implicit component to represent each boundary condition, vectorising the inputs/outputs to represent multiple rows/cols in the matrix, then allowing openMDAO to solve for the x while driving the residual for each boundary condition to 0.
I've run into trouble trying to make this work, as each implicit component is underdetermined (more rows in the output vector x than the component output residuals; that is, A1.x - b1= R1, length(R1) < length(x). Essentially, I would like openMDAO to take each of these underdetermined implicit systems, and find the value of x that solves the determined full system - without needing to do all of the assembling stuff myself.
Something like this:
To try and make my goal clearer, I'll explain what I actually want from the perspective of my panel method. I'd like a component, let's say Influence, that computes the potential induced by a given panel at a given point in the panel's reference frame. I'd like to vectorise the input panels and points such that it can compute the influence coefficent of many panels on one point, of many points on one panel, or of many points on many panels.
I'd then like a system of implicit boundary conditions to find the correct value of mu to solve the system. These boundary conditions, again, should be able to be vectorised to compute the violation of the boundary condition at many points under the influence of many panels.
I get confused again at this part. Not every boundary condition will use the influence coefficient values - some, like the Kutta condition, are just enforced on the mu vector, e.g .
How would I implement this as an implicit component? It has no inputs, and doesn't output the full mu vector.
I appreciate that the question is rather long and rambling, but I'm pretty confused. To summarise:
How can I use openMDAO to solve multiple individually underdetermined (but combined, fully determined) implicit systems?
How can I use openMDAO to write an implicit component that takes no inputs and only uses a portion of the overall solution vector?
In the OpenMDAO docs there is a close analog to what you are trying to accomplish, with the node-voltage analysis tutorial. In that code, the balance comp is used to create an implicit relationship that is similar to what you're describing. Its singular on its own, but part of a larger group is a well defined system.
You'll need to find a way to build similar components for your model. Each "row" in your equation will be associated with one state variables (one entry in your x vector).
In the simplest case, each row (or set of rows) would have one input which is the associated row of the A matrix, and a second input which is ALL of the other values for x, and a final input which is the entry of the b vector (right hand side vector). Then you could evaluate the residual for that specific row, which would be the following
R['x_i'] = np.sum(A*x_full) - b
where x_full is the assembly of the full x-vector from the x_other input and the x_i state variable.
#########
Having proposed the above solution, I have to say that I don't think this is a particularly efficient way to build or solve this linear system. It is modular, and might give you some flexibility, but you're jumping through a lot of hoops to avoid doing some index-math, and shoving everything into a matrix.
Granted, the derivatives might be a bit easier in your design, because the matrix assembly is going to get handled "magically" by the connections you have to create between the various row-components. So maybe its worth the trade... but i would say you might be better of trying a more traditional coding approach and using JAX or some other AD code to make the derivatives easier.

Fitting multiple Lorentzians to data using scipy in Python3

Okay so I appreciate this will require a bit of patience but bear with me. I'm analysing some Raman spectra and have written the basis of a program to use Scipy curve_fit to fit multiple lorentzians to the peaks on my data. The trick is I have so much data that I want the program to automatically identify initial guesses for the lorentzians, rather than manually doing it. On the whole, the program gives it a good go (and might be of use to others with a similar use case with simpler data), but I don't know Scipy well enough to optimise the curve_fit enough to make it work on many different examples.
Code repo here: https://github.com/btjones-me/raman_spectroscopy
An example of it working well can be see in fig 1.
Part of the problem is my peak finding algorithm, which sometimes struggles to find the appropriate starting positions for each lorentzian. You can see this in fig 2.
The next problem is that, for some reason, curve_fit occasionally catastrophically diverges (my best guess is due to rounding errors). You can see this behaviour in fig 3.
Finally while I usually make good guesses with the height and x position of each lorentzian, I haven't found a good way of predicting the width, or FWHM of the curves. Predicting this might help curve_fit.
If anybody can help with either of these problems in any way I would appreciate it greatly. I'm open to any other methods or suggestions including additional third party libraries, so long as they improve upon the current implementation. Many thanks to anybody who attempts this one!
Here it is working exactly as I intend:
Below you can see the peak finding method has failed to identify all the peaks. There are many peak finding algorithms, but the one I use is Scipy's 'find_peaks_cwt()' (it's not usually this bad, this is an extreme case).
Here it's just totally way off. This happens fairly often and I can't really understand why, nor stop it from happening. Possibly it occurs when I tell it to find more/less peaks than are available in the spectra, but just a guess.
I've done this in Python 3.5.2. PS I know I won't be winning any medals for code layout, but as always comments on code style and better practices are welcomed.
I stumbled across this because I'm attempting to solve the exact same problem, here is my solution. For each peak, I only fit my lorentzian in the region of the domain + or - 1/2 the distance to the next closest peak. Here is my function that does breaks up the domain:
def get_local_indices(peak_indices):
#returns the array indices of the points closest to each peak (distance to next peak/2)
chunks = []
for i in range(len(peak_indices)):
peak_index = peak_indices[i]
if i>0 and i<len(peak_indices)-1:
distance = min(abs(peak_index-peak_indices[i+1]),abs(peak_indices[i-1]-peak_index))
elif i==0:
distance = abs(peak_index-peak_indices[i+1])
else:
distance = abs(peak_indices[i-1]-peak_index)
min_index = peak_index-int(distance/2)
max_index = peak_index+int(distance/2)
indices = np.arange(min_index,max_index)
chunks.append(indices)
return chunks
And here is a picture of the resulting graph, the dashed lines indicate which areas the lorentzians are fit in:
Lorentzian Fitting Plot

Parallel many dimensional optimization

I am building a script that generates input data [parameters] for another program to calculate. I would like to optimize the resulting data. Previously I have been using the numpy powell optimization. The psuedo code looks something like this.
def value(param):
run_program(param)
#Parse output
return value
scipy.optimize.fmin_powell(value,param)
This works great; however, it is incredibly slow as each iteration of the program can take days to run. What I would like to do is coarse grain parallelize this. So instead of running a single iteration at a time it would run (number of parameters)*2 at a time. For example:
Initial guess: param=[1,2,3,4,5]
#Modify guess by plus minus another matrix that is changeable at each iteration
jump=[1,1,1,1,1]
#Modify each variable plus/minus jump.
for num,a in enumerate(param):
new_param1=param[:]
new_param1[num]=new_param1[num]+jump[num]
run_program(new_param1)
new_param2=param[:]
new_param2[num]=new_param2[num]-jump[num]
run_program(new_param2)
#Wait until all programs are complete -> Parse Output
Output=[[value,param],...]
#Create new guess
#Repeat
Number of variable can range from 3-12 so something such as this could potentially speed up the code from taking a year down to a week. All variables are dependent on each other and I am only looking for local minima from the initial guess. I have started an implementation using hessian matrices; however, that is quite involved. Is there anything out there that either does this, is there a simpler way, or any suggestions to get started?
So the primary question is the following:
Is there an algorithm that takes a starting guess, generates multiple guesses, then uses those multiple guesses to create a new guess, and repeats until a threshold is found. Only analytic derivatives are available. What is a good way of going about this, is there something built already that does this, is there other options?
Thank you for your time.
As a small update I do have this working by calculating simple parabolas through the three points of each dimension and then using the minima as the next guess. This seems to work decently, but is not optimal. I am still looking for additional options.
Current best implementation is parallelizing the inner loop of powell's method.
Thank you everyone for your comments. Unfortunately it looks like there is simply not a concise answer to this particular problem. If I get around to implementing something that does this I will paste it here; however, as the project is not particularly important or the need of results pressing I will likely be content letting it take up a node for awhile.
I had the same problem while I was in the university, we had a fortran algorithm to calculate the efficiency of an engine based on a group of variables. At the time we use modeFRONTIER and if I recall correctly, none of the algorithms were able to generate multiple guesses.
The normal approach would be to have a DOE and there where some algorithms to generate the DOE to best fit your problem. After that we would run the single DOE entries parallely and an algorithm would "watch" the development of the optimizations showing the current best design.
Side note: If you don't have a cluster and needs more computing power HTCondor may help you.
Are derivatives of your goal function available? If yes, you can use gradient descent (old, slow but reliable) or conjugate gradient. If not, you can approximate the derivatives using finite differences and still use these methods. I think in general, if using finite difference approximations to the derivatives, you are much better off using conjugate gradients rather than Newton's method.
A more modern method is SPSA which is a stochastic method and doesn't require derivatives. SPSA requires much fewer evaluations of the goal function for the same rate of convergence than the finite difference approximation to conjugate gradients, for somewhat well-behaved problems.
There are two ways of estimating gradients, one easily parallelizable, one not:
around a single point, e.g. (f( x + h directioni ) - f(x)) / h;
this is easily parallelizable up to Ndim
"walking" gradient: walk from x0 in direction e0 to x1,
then from x1 in direction e1 to x2 ...;
this is sequential.
Minimizers that use gradients are highly developed, powerful, converge quadratically (on smooth enough functions).
The user-supplied gradient function
can of course be a parallel-gradient-estimator.
A few minimizers use "walking" gradients, among them Powell's method,
see Numerical Recipes p. 509.
So I'm confused: how do you parallelize its inner loop ?
I'd suggest scipy fmin_tnc
with a parallel-gradient-estimator, maybe using central, not one-sided, differences.
(Fwiw,
this
compares some of the scipy no-derivative optimizers on two 10-d functions; ymmv.)
I think what you want to do is use the threading capabilities built-in python.
Provided you your working function has more or less the same run-time whatever the params, it would be efficient.
Create 8 threads in a pool, run 8 instances of your function, get 8 result, run your optimisation algo to change the params with 8 results, repeat.... profit ?
If I haven't gotten wrong what you are asking, you are trying to minimize your function one parameter at the time.
you can obtain it by creating a set of function of a single argument, where for each function you freeze all the arguments except one.
Then you go on a loop optimizing each variable and updating the partial solution.
This method can speed up by a great deal function of many parameters where the energy landscape is not too complex (the dependency between the parameters is not too strong).
given a function
energy(*args) -> value
you create the guess and the function:
guess = [1,1,1,1]
funcs = [ lambda x,i=i: energy( guess[:i]+[x]+guess[i+1:] ) for i in range(len(guess)) ]
than you put them in a while cycle for the optimization
while convergence_condition:
for func in funcs:
optimize fot func
update the guess
check for convergence
This is a very simple yet effective method of simplify your minimization task. I can't really recall how this method is called, but A close look to the wikipedia entry on minimization should do the trick.
You could do parallel at two parts: 1) parallel the calculation of single iteration or 2) parallel start N initial guessing.
On 2) you need a job controller to control the N initial guess discovery threads.
Please add an extra output on your program: "lower bound" that indicates the output values of current input parameter's decents wont lower than this lower bound.
The initial N guessing thread can compete with each other; if any one thread's lower bound is higher than existing thread's current value, then this thread can be dropped by your job controller.
Parallelizing local optimizers is intrinsically limited: they start from a single initial point and try to work downhill, so later points depend on the values of previous evaluations. Nevertheless there are some avenues where a modest amount of parallelization can be added.
As another answer points out, if you need to evaluate your derivative using a finite-difference method, preferably with an adaptive step size, this may require many function evaluations, but the derivative with respect to each variable may be independent; you could maybe get a speedup by a factor of twice the number of dimensions of your problem. If you've got more processors than you know what to do with, you can use higher-order-accurate gradient formulae that require more (parallel) evaluations.
Some algorithms, at certain stages, use finite differences to estimate the Hessian matrix; this requires about half the square of the number of dimensions of your matrix, and all can be done in parallel.
Some algorithms may also be able to use more parallelism at a modest algorithmic cost. For example, quasi-Newton methods try to build an approximation of the Hessian matrix, often updating this by evaluating a gradient. They then take a step towards the minimum and evaluate a new gradient to update the Hessian. If you've got enough processors so that evaluating a Hessian is as fast as evaluating the function once, you could probably improve these by evaluating the Hessian at every step.
As far as implementations go, I'm afraid you're somewhat out of luck. There are a number of clever and/or well-tested implementations out there, but they're all, as far as I know, single-threaded. Your best bet is to use an algorithm that requires a gradient and compute your own in parallel. It's not that hard to write an adaptive one that runs in parallel and chooses sensible step sizes for its numerical derivatives.

Categories

Resources