I am trying to solve a physical problem by coupling a simulation software to Python. Basically, I need to find the values of length and diameter of each of the pipe sections in the picture below (line segment between any 2 black dots is a pipe section) such that fluid flow from point 0 reaches points 1-5 at the same time instant. I give some starting values for the length and diameter of each of the pipe sections and the simulation software solves to check if the fluid reaches the points 1-5 at the same time instant. If not, the lengths and diameters of the pipe section(s) need to be changed to ensure this. Flow not reaching points 1-5 at the same instant is known as flow imbalance, and ideally I need to reduce this imbalance to zero.
Now my question is: can I couple Python to the simulation software to suggest values of the length and diameter of the various pipe sections so that flow reaches points 1-5 at the same time instant? I already know how to run the simulation software through a Python script, and how to extract the flow imbalance result from the software. All I want to know is whether a library/function exists in Python that can iteratively suggest values for the length and diameter of the pipe section(s) so that the flow imbalance decreases after every iteration.
Please note that it is not possible to frame a closed-form objective function of the lengths and diameters of the pipe section(s) to minimize or maximize in order to eliminate the flow imbalance; running the software simulation is the only way to actually check the imbalance. I know that optimization libraries such as scipy.optimize exist, but AFAIK they work on an objective function. I could not find anything that would suggest values for the length and diameter of pipe sections depending on how large the flow imbalance is after every iteration.
So you can write a function that wraps the simulation call:

import numpy as np

def imbalance(pipe_diameters):
    # get_pipe_times runs the simulation software and returns the
    # arrival times of the flow at points 1-5
    times = get_pipe_times(pipe_diameters)
    return times - np.mean(times)
Then you can use:

from scipy.optimize import leastsq

x0 = uniform_diameter_pipes()  # initial guess
diameters, ier = leastsq(imbalance, x0)  # leastsq returns (solution, status flag)
If the number of parameters is more than the number of outputs, then you may have to use minimize as mentioned in the comments. In that case your imbalance function must return a scalar:

def imbalance(pipe_diameters):
    times = get_pipe_times(pipe_diameters)
    return np.var(times)  # variance of arrival times; other scalar metrics work too
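For completeness, a minimal sketch of driving the scalar version with scipy.optimize.minimize, reusing imbalance and x0 from above. Nelder-Mead is derivative-free, which suits a noisy black-box simulation; the choice of method here is an assumption, not a requirement:

from scipy.optimize import minimize

result = minimize(imbalance, x0, method="Nelder-Mead")  # derivative-free search
best_diameters = result.x
print("final imbalance (variance):", result.fun)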
With Python I want to compare a simulated light curve with a measured light curve. It should be mentioned that the measured data contain gaps and outliers, and the time steps are not constant; the model, however, uses constant time steps.
As a first step, I would like to quantify with a statistical method how similar the two light curves are. Which method is best suited for this?
As a second step, I would like to fit the model to my measurement data. However, the model data is not calculated in Python but in independent software. Basically, the model data depends on four parameters, all of which are limited to a certain range, which I am currently feeding manually to the software (automating this is planned).
What is the best method to create a suitable fit?
A "Brute-Force-Fit" is currently an option that comes to my mind.
This link "https://imgur.com/a/zZ5xoqB" provides three different plots. The simulated lightcurve, the actual measurement and lastly both together. The simulation is not good, but by playing with the parameters one can get an acceptable result. Which means the phase and period are the same, magnitude is in the same order and even the specular flashes should occur at the same period.
If I understand this correctly, you're asking a more foundational question that would be better answered on https://datascience.stackexchange.com/ rather than something specific to Python.
That said, speaking as a data science layperson, this may be a problem suited to gradient descent with a mean-square-error cost function. You initialize the parameters of the curve (possibly randomly), then calculate the square error at your known points.
Then you make tiny changes to each parameter in turn and calculate how the cost function is affected. Then you change all the parameters (by a tiny amount) in the direction that decreases the cost function. Repeat this until the parameters stop changing.
(Note that this might trap you in a local minimum and not work.)
More information: https://towardsdatascience.com/implement-gradient-descent-in-python-9b93ed7108d1
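As a toy illustration of the finite-difference variant of this idea (run_model is a hypothetical wrapper that calls the external software and returns the simulated curve sampled at the measured times; all names here are placeholders):

import numpy as np

def mse(params, measured, run_model):
    # mean-square error between simulated and measured curves
    return np.mean((run_model(params) - measured) ** 2)

def gradient_descent(params, measured, run_model, lr=1e-3, eps=1e-6, steps=1000):
    params = np.asarray(params, dtype=float)
    for _ in range(steps):
        base = mse(params, measured, run_model)
        grad = np.zeros_like(params)
        for i in range(len(params)):
            nudged = params.copy()
            nudged[i] += eps  # tiny change to one parameter at a time
            grad[i] = (mse(nudged, measured, run_model) - base) / eps
        params -= lr * grad  # step all parameters downhill together
    return params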
Edit: I overlooked this part
The simulation is not good, but by playing with the parameters one can get an acceptable result, meaning the phase and period are the same, the magnitude is of the same order, and even the specular flashes occur at the same period.
Is the simulated curve just a sum of sine waves, with the parameters being the phase/period/amplitude of each? In that case, what you're looking for is the Fourier transform of your signal, which is very easy to calculate with numpy: https://docs.scipy.org/doc/scipy/reference/tutorial/fftpack.html
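For illustration, a minimal sketch of picking out the dominant period with numpy's FFT. It assumes evenly sampled data; the file name and sampling interval are placeholders:

import numpy as np

dt = 1.0  # sampling interval (assumed constant)
signal = np.loadtxt("lightcurve.txt")  # hypothetical evenly sampled measurements
spectrum = np.fft.rfft(signal - signal.mean())  # remove the DC offset first
freqs = np.fft.rfftfreq(len(signal), d=dt)
dominant = freqs[np.argmax(np.abs(spectrum))]
print("dominant period:", 1.0 / dominant)

Note that this only works directly on the constant-step model output; the gappy, unevenly sampled measurements would need something like a Lomb-Scargle periodogram instead.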
Query:
I want to estimate the trajectory of a person wearing an IMU between point a and point b. I know the exact location of point a and point b in an x,y,z space and the time it takes the person to walk between the points.
Is it possible to reconstruct the trajectory of the person moving from point a to point b using the data from an IMU and the time?
This question is too broad for SO. You could write a PhD thesis answering it, and I know people who have.
However, yes, it is theoretically possible.
That said, there are a few things you'll have to deal with:
Your system is going to discretize time on some level. The result is that your estimate of position will be non-smooth. Increasing sampling rates is one way to address this, but this frequently increases the noise of the measurement.
Possible paths are non-unique. Knowing the time it takes to travel from a-b constrains slightly the information from the IMUs, but you are still left with an infinite family of possible routes between the two. Since you mention that you're considering a person walking between two points with z-components, perhaps you can constrain the route using knowledge of topography and roads?
IMUs function by integrating accelerations to velocities and velocities to positions. If the accelerations have measurement errors, and they always do, then the error in your estimate of the position will grow over time. The longer you run the system for, the more the results will diverge. However, if you're able to use roads/topography as a constraint, you may be able to restart the integration from known points in space; that is, if you can detect 90 degree turns on a street grid, each turn gives you the opportunity to tie the integrator back to a feasible initial condition.
Given the above, perhaps the most important question you have to ask yourself is how much error you can tolerate in your path reconstruction. Low-error estimates are going to require better (i.e. more expensive) sensors, higher sampling rates, and higher-order integrators.
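To make the drift point concrete, here is a toy dead-reckoning sketch that double-integrates noisy accelerometer samples; all numbers are synthetic assumptions, not real sensor specs:

import numpy as np

dt = 0.01  # 100 Hz sampling (assumed)
t = np.arange(0, 60, dt)
true_accel = np.zeros_like(t)  # the person is actually standing still
noise = np.random.normal(0.0, 0.05, t.shape)  # accelerometer noise in m/s^2
accel = true_accel + noise

velocity = np.cumsum(accel) * dt  # integrate acceleration once
position = np.cumsum(velocity) * dt  # and again for position
print("position error after 60 s: %.2f m" % abs(position[-1]))

Even with zero true motion, the integrated position wanders away from zero, and the error grows with time.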
I'm trying to smooth out noise in a slightly unusual situation. There's probably a common algorithm to solve my problem, but I don't know what it is.
I'm currently building a robotic telescope mount. To track the movement of the sky, the mount takes a photo of the sky once per second and tracks changes in the X, Y, and rotation of the stars it can see.
If I just use the raw measurements to track rotation, the output is choppy and noisy, like this:
Guiding with raw rotation measurements:
If I use a lowpass filter, the mount overshoots and never completely settles down. A lower Beta value helps with this, but then the corrections are too slow and error accumulates.
Guiding with lowpass filter:
(In both graphs, purple is the difference between sky and mount rotation, red is the corrective rotations made by the mount.)
A moving average had the same problems as the lowpass filter.
More information about the problem:
For a given area of the sky, the rotation of the stars will be constant. However, we don't know where we are and the measurement of sky rotation is very noisy due to atmospheric jitter, so the algorithm has to work its way towards this initially unknown constant value while guiding.
The mount can move as far as necessary in one second, and has its own control system. So I don't think this is a PID loop control system problem.
It's OK to guide badly (or not at all) for the first 30 seconds or so.
I wrote a small Python program to simulate the problem - might as well include it here, I suppose. This one is currently using a lowpass filter.
#!/usr/bin/env python3
import random

import matplotlib.pyplot as plt

ROTATION_CONSTANT = 0.1  # true sky rotation rate (degrees per second)
TIME_WINDOW = 300        # simulated duration in seconds

skyRotation = 0
mountRotation = 0
errorList = []
rotationList = []
measurementList = []
smoothData = 0
LPF_Beta = 0.08  # lowpass filter coefficient

for step in range(TIME_WINDOW):
    skyRotation += ROTATION_CONSTANT
    # Atmospheric jitter: triangular noise in [-1, 1]
    randomNoise = random.random() - random.random()
    rotationMeasurement = skyRotation - mountRotation + randomNoise

    # Lowpass filter the noisy measurement, then correct the mount by it
    smoothData = smoothData - (LPF_Beta * (smoothData - rotationMeasurement))
    mountRotation += smoothData

    rotationList.append(smoothData)
    errorList.append(skyRotation - mountRotation)
    measurementList.append(rotationMeasurement)

plt.plot([0, TIME_WINDOW], [ROTATION_CONSTANT, ROTATION_CONSTANT],
         color='black', linestyle='-', linewidth=2)
plt.plot(errorList, color="purple")
plt.plot(rotationList, color="red")
plt.plot(measurementList, color="blue", alpha=0.2)
plt.axis([0, TIME_WINDOW, -1.5, 1.5])
plt.xlabel("Time (seconds)")
plt.ylabel("Rotation (degrees)")
plt.show()
If anyone knows how to make this converge smoothly (or could recommend relevant learning resources), I would be most grateful. I'd be happy to read up on the topic but so far haven't figured out what to look for!
I would first of all try to do this the easy way by making your control outputs the result of a PID controller, and then tuning the PID as described at e.g. https://robotics.stackexchange.com/questions/167/what-are-good-strategies-for-tuning-pid-loops or from your favourite web search.
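For reference, a textbook discrete PID sketch; the gains kp, ki, kd are placeholders to be tuned as described in the link above:

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt=1.0):
        # accumulate the integral term and estimate the derivative
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative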
Most other approaches require you to have an accurate model of the situation, including the response of the hardware under control to your control inputs, so your next step might be experiments to work this out, e.g. by measuring the response to simple test inputs such as an impulse or a step. Once you have a simulator you can, at the very least, tune parameters for proposed approaches more quickly and safely on the simulator than on the real hardware.
If your simulator is accurate, and if you are seeing more problems in the first 30 seconds than afterwards, I suggest using a Kalman filter to estimate the current error, and then sending in the control that (according to the model that you have constructed) will minimise the mean squared error between the time the control is acted upon and the time of the next observation. Using a Kalman filter will at least take account of the increased observational error when the system starts up.
Warning: the above use of the Kalman filter is myopic, and will fail dramatically in some situations where there is something corresponding to momentum: it will over-correct and end up swinging wildly from one extreme to another. Better use of the Kalman filter results would be to compute a number of control inputs, minimizing the predicted error at the end of this sequence of inputs (e.g. with dynamic programming) and then revisit the problem after the first control input has been executed. In the simple example where I found over-correction you can get stable behavior if you calculate the single control action that minimizes the error if sustained for two time periods, but revisit the problem and recalculate the control action at the end of one time period. YMMV.
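Purely as a rough illustration of the estimation step (not a full control design), here is a scalar Kalman filter for the constant sky-rotation rate in the simulation above; the process and measurement variances q and r are assumptions (r = 1/6 matches the variance of the toy triangular noise):

def kalman_step(estimate, variance, measurement, q=1e-5, r=1/6):
    # Predict: the rate is modelled as constant, so only the
    # uncertainty grows, by the process noise q.
    variance += q
    # Update: blend prediction and measurement by the Kalman gain.
    gain = variance / (variance + r)
    estimate += gain * (measurement - estimate)
    variance *= (1 - gain)
    return estimate, variance

Started with a large initial variance, the gain is high at first (fast convergence during the first 30 seconds or so) and falls as confidence in the estimate grows.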
If that doesn't work, perhaps it is time to take your accurate simulation, linearize it to get differential equations, and apply classical control theory. If it won't linearize reasonably over the full range, you could try breaking that range down, perhaps using different strategies for large and small errors.
Such (little) experience as I have from control loops suggests that it is extremely important to minimize the delay and jitter in the loop between the sensor sensing and the control actuating. If there is any unnecessary source of jitter or delay between input and control forget the control theory while you get that fixed.
Background
I am trying to estimate the potential energy supply within a geographical area using spatially explicit data. For this purpose, I built a Bayesian network (HydeNet package) and attached it to a raster stack in R. The Bayesian network model reads the input data (e.g. resource supply, conversion efficiency) of each cell location from the raster stack and computes the corresponding energy supply (MCMC simulations). As a result I obtain a new raster layer with a specific probability distribution of the expected energy supply for each raster cell.
However, I am equally interested in the total energy supply within the study area. That means I need to aggregate (sum) the potential supply of all the raster cells in order to get the overall supply potential within the area.
Research
The mathematical operation I want to do is called convolution. R provides a corresponding function called convolve that makes use of the Fast Fourier Transform.
The examples I found so far (e.g. example 1, 2) were limited to the addition of two distributions at a time. However, I would like to sum up multiple distributions (thousands, even millions).
Question
How can I sum up (convolve) multiple probability distributions?
I have up to 18,000,000 probability distributions, so computational efficiency will certainly be a big issue.
Further, I am mainly interested in a solution in R, but other solutions (notably Python) are appreciated too.
I don't know if convolving multiple distributions at once would result in a speed increase. Wouldn't something like a123 = convolve(a1, a2, a3) simplify behind the scenes to a12 = convolve(a1, a2); a123 = convolve(a12, a3)? Regardless, in R you could try the foreach package and run all the convolutions in parallel; on a quad core that would (theoretically) speed up the calculations by a factor of 4. If you really want more speed you could try the OpenCL package to see if you can run these calculations in parallel on a GPU, but programming-wise this is not easy to get into. If I were you, I would focus on these kinds of solutions rather than trying to speed up the convolution functions themselves.
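Since the question also welcomes Python: a minimal sketch of summing many independent discrete distributions by repeated pairwise convolution. The list dists is a hypothetical set of probability mass functions defined on a common value grid:

import numpy as np
from functools import reduce

# Hypothetical input: 1,000 identical three-point PMFs on a common grid.
dists = [np.array([0.2, 0.5, 0.3]) for _ in range(1000)]

# The PMF of a sum of independent variables is the convolution of their PMFs.
total = reduce(np.convolve, dists)
print(len(total), total.sum())  # the support grows; probabilities still sum to ~1

For millions of distributions, an FFT-based scheme scales much better: pad every PMF to the final support length, multiply their FFTs pointwise, and invert once.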
I have a specific problem where I need an equation fitted to a set of data, from which I can extract a general function (it can be complex) that gives me a Y value as close to the data as possible, as if I had used the data as a lookup table.
I have been reading up on various approaches and Multivariate adaptive regression splines [MARS] seemed like an excellent candidate. The problem I have is that it is not detecting/fitting a hinge at a very important segment of the data.
I'm primarily using R with the Earth package, with the intention to put an equation in to Excel. I can use other languages or packages if it will give me the results I need.
PROBLEM:
At the low end of my data I have a small set of values that are an important lower bound that need to have a hinge or knot placed.
The rest of the data should have automatic hinge/knot detection.
Example:
X        Y
0        130
1        130
10000    130
(X values past 10,000 have increasing Y's at various rates)
MARS averages the 0 through 10,000 range into the increasing Y values, so if I call predict(model, 5000) I may get, say, 150 as a result. The fit needs a flat linear segment that hinges at 10,000. This missing hinge makes the high values of X very accurate in the MARS model output, while the low values of X diverge significantly from the base data.
I would rather not manually place this as the lower end may change and I would like a generalized approach.
Does anyone know of an approach similar to MARS that can provide:
automatic knot/hinge detection;
output as an equation I can place into Excel;
and, if the automatic detection fails on an important section, the ability to manually specify a point to hinge on?
The MARS approach works for all the other breakpoints in the data, but because of the limited "range" of the lower bound it doesn't place a hinge there, even with pruning turned off.
Does anyone know of a better approach?
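One fallback sketch (in Python, to keep one language across this thread): build the MARS-style hinge basis max(0, X - knot) by hand for the known lower-bound knot and solve by least squares. The fitted coefficients translate directly into an Excel formula. The data and knot here are purely illustrative, and the knot placement is manual, not automatic:

import numpy as np

def hinge(x, knot):
    # MARS-style hinge basis function: max(0, x - knot)
    return np.maximum(0.0, x - knot)

x = np.array([0, 1, 10000, 12000, 15000, 20000], dtype=float)  # toy data
y = np.array([130, 130, 130, 180, 260, 400], dtype=float)

knot = 10000.0  # the important lower-bound knot, specified manually
A = np.column_stack([np.ones_like(x), hinge(x, knot)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
# Equivalent Excel formula: = coef[0] + coef[1] * MAX(0, X - 10000)
print(coef)

Extra hinge columns for the knots MARS does detect automatically can be appended to A in the same way.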