Suppose I have a system that is driven by a signal comprising 3 voltage levels (say -V1, 0, V1). I need to determine the composition of the signal that most accurately produces the desired output. The output is a single number that represents the current state of the system. The number of possible permutations for such a signal is too high to brute-force and find the global minimum, i.e. exploring the entirety of the search space is impossible. However, I do have a model that simulates the system, so I can still process several possible options. How can I find the best signal to produce the desired output (in other words, the signal that drives the system to the desired state)?
One method that I have right now involves producing a starting set (i.e. a small subset of the search space) of signals that satisfy a set of constraints, finding the signal that produces the output closest to the desired output, and making modifications to this signal (i.e. fine-tuning) in order to obtain the desired output. This final step is difficult for me, as I am doing it manually. One idea to automate it is to parametrize all possible modifications (for instance, parameter x1 = 1 adds a single -V1 'frame' to the signal, x1 = 2 adds two such frames, x1 = -1 removes a -V1 frame, and so on), and step through the set of possible modifications. But again, there are a lot of possibilities. To improve upon this, I explored the effect that modifications have on the system output. The effects of these modifications look somewhat predictable (the distributions of the changes in output they produce generally follow Gaussian distributions). But I'm not sure how to proceed from here. What models/schemes would you suggest I use? Can I use information from the distributions of changes produced by modifications to intelligently fine-tune the signal? And how do I account for outliers (i.e. cases wherein the modification(s) to an initial signal produce a change in output that lies in the tail of the distribution)?
Edit: Forgot to mention, but the constraints on the signal are length (the number of frames/steps in the signal must be less than or equal to a finite positive integer, N) and total potential (i.e. the sum of the voltages in the signal must equal an integer, V).
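For illustration, here is a minimal sketch of the parametrized-modification idea I described, as a greedy local search. It assumes hypothetical helpers: simulate(signal) stands in for my system model, and V1, N, V are the problem constants from the constraints above. This is one possible scheme, not a definitive implementation:

import numpy as np

def valid(signal):
    # Constraints from the edit above: length at most N, voltages sum to V.
    return len(signal) <= N and np.isclose(sum(signal), V)

def apply_mod(signal, x1):
    # x1 > 0 appends x1 frames at -V1; x1 < 0 removes that many -V1 frames.
    s = list(signal)
    if x1 > 0:
        s += [-V1] * x1
    else:
        for _ in range(-x1):
            if -V1 in s:
                s.remove(-V1)
    return s

def fine_tune(signal, target, mods=range(-3, 4)):
    # Greedy fine-tuning: try each candidate modification, keep the best valid one.
    best, best_err = signal, abs(simulate(signal) - target)
    for x1 in mods:
        cand = apply_mod(signal, x1)
        if valid(cand):
            err = abs(simulate(cand) - target)
            if err < best_err:
                best, best_err = cand, err
    return best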
I am trying to solve a physical problem by coupling a simulation software to Python. Basically, I need to find the values of the length and diameter of each of the pipe sections in the picture below (a line segment between any 2 black dots is a pipe section) such that fluid flow from point 0 reaches points 1-5 at the same time instant. I give some starting values for the length and diameter of each of the pipe sections, and the simulation software solves to check whether the fluid reaches points 1-5 at the same time instant. If not, the lengths and diameters of the pipe section(s) need to be changed to ensure this. Flow not reaching points 1-5 at the same instant is known as flow imbalance, and ideally I need to reduce this imbalance to zero.
Now my question is - can I couple Python to the simulation software to suggest values of the length and diameter of the various pipe sections to ensure that flow reaches points 1-5 at the same time instant? I already know how to run the simulation software through a Python script, and how to extract the flow imbalance result from the software. All I want to know is: does a library/function exist in Python that can iteratively suggest values for the length and diameter of the pipe section(s) such that the flow imbalance reduces after every iteration?
Please know that it is not possible to frame an objective function that considers the length and diameter of the pipe section(s) and can be minimized or maximized analytically to eliminate flow imbalance; running the software simulation is the only way to actually check this flow imbalance. I know that optimization libraries such as scipy.optimize exist, but AFAIK they work on an objective function. I could not find anything that would suggest values for the length and diameter of pipe sections depending on how large the flow imbalance is after every iteration.
So you can write a function:

import numpy as np

def imbalance(pipe_diameters):
    # Run the simulation, then return each point's deviation from the mean arrival time.
    times = get_pipe_times(pipe_diameters)
    return times - np.mean(times)
Then you can use:

from scipy.optimize import leastsq

x0 = uniform_diameter_pipes()            # initial guess
diameters, ier = leastsq(imbalance, x0)  # leastsq returns (solution, status flag)
If the number of parameters is more than the number of outputs, then you may have to use minimize as mentioned in the comments. In that case your imbalance must return a scalar:

def imbalance(pipe_diameters):
    times = get_pipe_times(pipe_diameters)
    return np.var(times)  # variance of the arrival times; another scalar metric works too
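For completeness, a minimal sketch of the minimize variant, reusing the same hypothetical helpers as above. Nelder-Mead is a reasonable choice here because the simulation is a black box with no gradients:

from scipy.optimize import minimize

x0 = uniform_diameter_pipes()
res = minimize(imbalance, x0, method='Nelder-Mead')  # derivative-free simplex search
diameters = res.x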
I'm working on a panel method code at the moment. To keep us from getting bogged down in the minutiae, I won't show the code - this is a question about overall program structure.
Currently, I solve my system by:
Generating the corresponding rows of the A matrix and b vector in an explicit component for each boundary condition
Assembling the partial outputs into the full A, b.
Solving the linear system, Ax=b, using a LinearSystemComp.
Here's a (crude) diagram:
I would prefer to be able to do this by just writing one implicit component to represent each boundary condition, vectorising the inputs/outputs to represent multiple rows/cols in the matrix, then allowing OpenMDAO to solve for x while driving the residual for each boundary condition to 0.
I've run into trouble trying to make this work, as each implicit component is underdetermined (more rows in the output vector x than component output residuals; that is, A1.x - b1 = R1, with length(R1) < length(x)). Essentially, I would like OpenMDAO to take each of these underdetermined implicit systems and find the value of x that solves the determined full system - without needing to do all of the assembling stuff myself.
Something like this:
To try and make my goal clearer, I'll explain what I actually want from the perspective of my panel method. I'd like a component, let's say Influence, that computes the potential induced by a given panel at a given point in the panel's reference frame. I'd like to vectorise the input panels and points such that it can compute the influence coefficient of many panels on one point, of many points on one panel, or of many points on many panels.
I'd then like a system of implicit boundary conditions to find the correct value of mu to solve the system. These boundary conditions, again, should be able to be vectorised to compute the violation of the boundary condition at many points under the influence of many panels.
I get confused again at this part. Not every boundary condition will use the influence coefficient values - some, like the Kutta condition, are enforced directly on the mu vector.
How would I implement this as an implicit component? It has no inputs, and doesn't output the full mu vector.
I appreciate that the question is rather long and rambling, but I'm pretty confused. To summarise:
How can I use openMDAO to solve multiple individually underdetermined (but combined, fully determined) implicit systems?
How can I use openMDAO to write an implicit component that takes no inputs and only uses a portion of the overall solution vector?
In the OpenMDAO docs there is a close analog to what you are trying to accomplish: the node-voltage analysis tutorial. In that code, the balance comp is used to create an implicit relationship similar to what you're describing. It's singular on its own, but as part of a larger group it forms a well-defined system.
You'll need to find a way to build similar components for your model. Each "row" in your equation will be associated with one state variable (one entry in your x vector).
In the simplest case, each row (or set of rows) would have one input which is the associated row of the A matrix, a second input which is ALL of the other values for x, and a final input which is the entry of the b vector (the right-hand-side vector). Then you could evaluate the residual for that specific row, which would be the following:

R['x_i'] = np.sum(A * x_full) - b  # elementwise product then sum = dot of this row with x_full
where x_full is the assembly of the full x-vector from the x_other input and the x_i state variable.
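As a concrete illustration, here is a minimal sketch of such a row component. The input names A_row, x_other, and b_i are hypothetical, the full x vector is assumed to have length n, and the partials are approximated with FD purely for brevity:

import numpy as np
import openmdao.api as om

class RowResidualComp(om.ImplicitComponent):
    """One row of A.x = b, expressed as an implicit relation for the state x_i."""

    def initialize(self):
        self.options.declare('n', types=int)  # length of the full x vector
        self.options.declare('i', types=int)  # index of the state this row owns

    def setup(self):
        n = self.options['n']
        self.add_input('A_row', shape=n)        # the i-th row of A
        self.add_input('x_other', shape=n - 1)  # every entry of x except x_i
        self.add_input('b_i')                   # the i-th entry of b
        self.add_output('x_i')                  # the state this component owns
        self.declare_partials('*', '*', method='fd')

    def apply_nonlinear(self, inputs, outputs, residuals):
        i = self.options['i']
        # Rebuild the full x vector from x_other and this component's own state.
        x_full = np.insert(inputs['x_other'], i, outputs['x_i'])
        residuals['x_i'] = inputs['A_row'].dot(x_full) - inputs['b_i']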
#########
Having proposed the above solution, I have to say that I don't think this is a particularly efficient way to build or solve this linear system. It is modular, and might give you some flexibility, but you're jumping through a lot of hoops to avoid doing some index-math, and shoving everything into a matrix.
Granted, the derivatives might be a bit easier in your design, because the matrix assembly is going to get handled "magically" by the connections you have to create between the various row-components. So maybe it's worth the trade... but I would say you might be better off trying a more traditional coding approach and using JAX or some other AD code to make the derivatives easier.
With Python I want to compare a simulated light curve with the real light curve. It should be mentioned that the measured data contain gaps and outliers, and the time steps are not constant. The model, however, has constant time steps.
In a first step I would like to compare with a statistical method how similar the two light curves are. Which method is best suited for this?
In a second step I would like to fit the model to my measured data. However, the model data is not calculated in Python but in an independent piece of software. Basically, the model depends on four parameters, all of which are limited to a certain range, and which I am currently feeding manually to the software (automating this is planned).
What is the best method to create a suitable fit?
A "brute-force fit" is currently the only option that comes to mind.
This link "https://imgur.com/a/zZ5xoqB" provides three different plots: the simulated light curve, the actual measurement, and lastly both together. The simulation is not good, but by playing with the parameters one can get an acceptable result, which means the phase and period are the same, the magnitude is of the same order, and even the specular flashes should occur at the same period.
If I understand this correctly, you're asking a more foundational question that could be better answered in https://datascience.stackexchange.com/, rather than something specific to Python.
That said, speaking as a data science layperson, this may be a problem suited for gradient descent with a mean-square-error cost function. You initialize the parameters of the curve (possibly randomly), then calculate the square error at your known points.
Then you make tiny changes to each parameter in turn, and calculate how the cost function is affected. Then you change all the parameters (by a tiny amount) in the direction that decreases the cost function. Repeat this until the parameters stop changing.
(Note that this might trap you in a local minimum and not work.)
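A minimal sketch of that loop, assuming a hypothetical run_simulation(params) wrapper around the external software and a measured array sampled at the same times as the model output:

import numpy as np

def cost(params):
    # Mean-square error between the simulated and the measured curve.
    model = run_simulation(params)
    return np.mean((model - measured) ** 2)

def gradient_descent(params, lr=1e-3, eps=1e-4, n_iter=200):
    params = np.asarray(params, dtype=float)
    for _ in range(n_iter):
        grad = np.zeros_like(params)
        for i in range(len(params)):
            step = np.zeros_like(params)
            step[i] = eps
            # Finite-difference estimate of d(cost)/d(param_i).
            grad[i] = (cost(params + step) - cost(params - step)) / (2 * eps)
        params -= lr * grad  # move downhill in all parameters at once
    return params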
More information: https://towardsdatascience.com/implement-gradient-descent-in-python-9b93ed7108d1
Edit: I overlooked this part
The simulation is not good, but by playing with the parameters one can get an acceptable result. Which means the phase and period are the same, magnitude is in the same order and even the specular flashes should occur at the same period.
Is the simulated curve just a sum of sine waves, and are the parameters just phase/period/amplitude of each? In this case what you're looking for is the Fourier transform of your signal, which is very easy to calculate with numpy: https://docs.scipy.org/doc/scipy/reference/tutorial/fftpack.html
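For example, a short sketch with numpy, assuming the measured curve has first been resampled onto a uniform grid with spacing dt (the FFT needs constant time steps):

import numpy as np

freqs = np.fft.rfftfreq(len(signal), d=dt)      # frequency axis
spectrum = np.fft.rfft(signal)
amplitude = np.abs(spectrum) * 2 / len(signal)  # per-component amplitude
phase = np.angle(spectrum)

# The dominant period is the inverse of the strongest non-zero frequency
# (skip index 0, which is just the mean of the signal).
k = 1 + np.argmax(amplitude[1:])
period = 1.0 / freqs[k]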
I have been using the SLSQP algorithm to run some MDO problems with ExplicitComponents only. Each component has a runtime of around 10 seconds and 60-100 input variables. Most of the input variables are static input variables that will remain constant during the entire optimization. The static input variables originate from an IndepVarComp. The ExplicitComponents are black boxes, so no information is available on the partials.
I noticed that when the Jacobian is calculated in compute_totals(), the components are linearized with respect to all of their input values. In compute_approximations(), a finite difference is calculated over all the input values, including the static ones. So, my question is: why is a finite difference calculation performed over these static input variables? As the values remain constant, I'm not sure why this information would be useful.
Furthermore, if I understand it correctly, the components are linearized to get the sub-Jacobians, which are then used to calculate the total Jacobian. However, is it possible to directly calculate a finite-difference over the entire group instead of linearizing each component? With the runtimes of my components and amount of input variables, it will take a long time to perform the linearization of each component. However, the optimization problem has only 3 design variables. So, if I could perform three finite difference calculations over the entire MDA to calculate the total Jacobian, the total runtime will decrease significantly.
To answer your questions in reverse order:
1) Can you FD over the entire model instead of each individual component? Yes!
You can set up FD over any group in your model, including the top-level group. Then the FD is taken across that group rather than across each component in it.
We call that computing a semi-total derivative because, in general, you can select a sub-group in your model, in which case the FD is approximating a total derivative across that group - but that total derivative is still effectively a partial derivative for the overall model. Hence, semi-total derivative.
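In code, this is a one-line setting on whichever group you want to FD across. A minimal sketch (the step and form values here are illustrative, not recommendations):

import openmdao.api as om

prob = om.Problem()
model = prob.model
# ... add your components/groups to `model` here ...

# FD across the whole group instead of linearizing each component inside it.
model.approx_totals(method='fd', step=1e-6, form='central')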
2) Why is a finite difference calculation performed over these static input variables?
In theory, you're correct that you don't really need partial derivatives of the inputs that can't change. As of OpenMDAO 2.4, we don't handle that situation automatically though, and we don't have plans to add that in the near future. However, the framework is only taking FD across the partials you tell it to. It sounds like you are declaring your partials like this:
self.declare_partials(of=['*'], wrt=['*'], method='fd')
So you're specifically asking the framework to compute all of those partials. Instead, you could specify in the wrt argument only the inputs you know are actually changing. Of course, this is mathematically incorrect, because there is a derivative with respect to the static inputs. If someone later on connects something to those inputs and tries an optimization, they would get a wrong answer. But as long as you're careful, you can specifically ask for only the partials you want from any component and simply leave the non-changing inputs as effectively 0.
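For instance (the input names here are hypothetical placeholders for the inputs that actually vary):

# Inside setup(), instead of wrt=['*']:
self.declare_partials(of=['*'], wrt=['chord', 'twist'], method='fd')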
I have a problem with a game I am making. I think I know the solution (or what solution to apply) but am not sure how all the 'pieces' fit together.
How the game works:
(from How to approach number guessing game(with a twist) algorithm? )
Users will be given items with a value (values change every day, and the program is aware of the change in price). For example:
Apple = 1
Pears = 2
Oranges = 3
They will then get a chance to choose any combo of them they like (i.e. 100 apples, 20 pears, and 1 orange). The only output the computer gets is the total value (in this example, it's currently $143). The computer will try to guess what they have, which obviously it won't be able to get correctly on the first turn.
        value  quantity(day1)  value(day1)
Apple     1         100            100
Pears     2          20             40
Orange    3           1              3
Total               121            143
On the next turn the user can modify their numbers, but by no more than 5% of the total quantity (or some other percent we may choose; I'll use 5% for this example). The prices of fruit can change (at random), so the total value may change based on that as well (for simplicity I am not changing fruit prices in this example). Using the above example, the user returns a value of $152 on day 2 of the game and $164 on day 3. Here's an example.
        quantity(day2)  %change(day2)  value(day2)  quantity(day3)  %change(day3)  value(day3)
Apple        104                           104           106                           106
Pears         21                            42            23                            46
Orange         2                             6             4                            12
Total        127            4.96%          152           133            4.72%          164
*(I hope the tables show up right; I had to manually space them, so hopefully it's not just working on my screen. If it doesn't work, let me know and I'll try to upload a screenshot.)
I am trying to see if I can figure out what the quantities are over time (assuming the user will have the patience to keep entering numbers). I know right now my only restriction is that the change in total quantity cannot be more than 5%, so I cannot get within 5% accuracy right now, and the user will be entering numbers forever.
What I have done so far:
I have taken all the values of the fruit and the total value of the fruit basket that's given to me and created a large table of all the possibilities. Once I have a list of all the possibilities, I use graph theory and create nodes for each possible solution. I then create edges (links) between nodes from consecutive days (for example, day 1 to day 2) if the change is within 5%. I then delete all nodes that do not have edges (links to other nodes), and as the user keeps playing I also delete entire paths when they become dead ends.
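A minimal sketch of that enumeration and linking step (brute enumeration like this only scales to small quantity ranges; the helper names are hypothetical, matching the rules above):

from itertools import product

def possibilities(prices, total, max_qty=200):
    # All quantity combos whose dot product with the prices equals the observed total.
    return [qty for qty in product(range(max_qty + 1), repeat=len(prices))
            if sum(p * q for p, q in zip(prices, qty)) == total]

def within_5pct(q_prev, q_next):
    # Edge rule: total quantity may change by at most 5% between days.
    return abs(sum(q_next) - sum(q_prev)) <= 0.05 * sum(q_prev)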
This is great because it narrows the choices down, but now I'm stuck because I want to narrow these choices even more. I've been told this is a hidden Markov problem, but a trickier version, because the states are changing (as you can see above, new nodes are being added every turn and old/improbable ones are being removed).
** If it helps, I got an amazing answer (with sample code) on a Python implementation of the Baum-Welch model (it's used to train the data) here: Example of implementation of Baum-Welch **
What I think needs to be done (this could be wrong):
Now that I have narrowed the results down, I am basically trying to allow the program to predict the correct basket based on the narrowed result base. I thought this was not possible, but several people are suggesting it can be solved with a hidden Markov model. I think I can run several iterations over the data (using a Baum-Welch model) until the probabilities stabilize (and they should get better with more turns from the user).
This is the way hidden Markov models are able to check spelling or handwriting and improve as they make errors (an error in this case would be picking a basket that is deleted on the next turn as being improbable).
Two questions:
How do I figure out the transition and emission matrices if all states are at first equally likely? For example, since all states start out equally likely, something must be used to dictate how the probability of states changes. I was thinking of using the graph I made to weight the nodes with the highest number of edges as part of the calculation of the transition/emission probabilities. Does that make sense, or is there a better approach?
How can I keep track of all the changes in states? As new baskets are added and old ones are removed, there becomes an issue of tracking the baskets. I thought a Hierarchical Dirichlet Process hidden Markov model (HDP-HMM) would be what I need, but I'm not exactly sure how to apply it.
(Sorry if I sound a bit frustrated... it's a bit hard knowing a problem is solvable but not being able to conceptually grasp what needs to be done.)
As always, thanks for your time and any advice/suggestions would be greatly appreciated.
Like you've said, this problem can be described with an HMM. You are essentially interested in maintaining a distribution over latent, or hidden, states, which would be the true quantities at each time point. However, it seems you are confusing the problem of learning the parameters of an HMM with simply doing inference in a known HMM. You have the latter problem but propose employing a solution (Baum-Welch) designed to do the former. That is, you have the model already; you just have to use it.
Interestingly, if you go through coding a discrete HMM for your problem, you get an algorithm very similar to what you describe in your graph-theory solution. The big difference is that your solution is tracking what is possible, whereas a correct inference algorithm, like the Viterbi algorithm, will track what is likely. The difference is clear when there is overlap in the 5% range on a domain, that is, when multiple possible states could potentially transition to the same state. Your algorithm might add 2 edges to a point, but I doubt that has an effect when you compute the next day (it should count twice, essentially).
Anyway, you could use the Viterbi algorithm if you are only interested in the best guess at the most recent day. Instead, I'll just give you a brief idea of how you can modify your graph-theory solution. Instead of maintaining edges between states, maintain a fraction representing the probability that each state is the correct one (this distribution is sometimes called the belief state). At each new day, propagate your belief state forward by incrementing each bucket by the probability of its parent (instead of adding an edge, you're adding a floating-point number). You also have to make sure your belief state is properly normalized (sums to 1), so just divide by its sum after each update. After that, you can weight each state by your observation, but since you don't have a noisy observation you can just set all the impossible states to zero probability and then re-normalize. You now have a distribution over the underlying quantities, conditioned on your observations.
I'm skipping over a lot of statistical details here, just to give you the idea.
Edit (re: questions):
The answer to your question really depends on what you want. If you want only the distribution for the most recent day, then you can get away with a one-pass algorithm like the one I've described. If, however, you want the correct distribution over the quantities at every single day, you're going to have to do a backward pass as well - hence, the aptly named forward-backward algorithm. I get the sense that since you are looking to go back a step and delete edges, you probably want the distribution for all days (unlike I originally assumed). Of course, you noticed there is information that can be used so that the "future can inform the past", so to speak, and this is exactly why you need the backward pass as well. It's not really complicated; you just have to run the same kind of algorithm starting at the end of the chain. For a good overview, check out Christopher Bishop's 6-piece tutorial on videolectures.net.
Because you mentioned adding/deleting edges, let me clarify the algorithm I described previously; keep in mind this is for a single forward pass. Let there be a total of N possible permutations of quantities, so you will have a belief state that is a sparse vector N elements long (call it v_0). In the first step you receive an observation of the sum, and you populate the vector by setting all the possible values to probability 1.0, then re-normalize. In the next step you create a new sparse vector (v_1) of all 0s, iterate over all non-zero entries in v_0, and increment (by the probability in v_0) all entries in v_1 that are within 5%. Then, zero out all the entries in v_1 that are not possible according to the new observation, re-normalize v_1, and throw away v_0. Repeat forever; v_1 will always be the correct distribution of possibilities.
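A minimal sketch of that forward pass, assuming hypothetical helpers: possible(state, total) checks a quantity vector against the observed total, and within_5pct(prev, nxt) checks the 5% rule:

import numpy as np

def initial_belief(states, total):
    # Uniform belief over every state consistent with the first observation.
    v = np.array([1.0 if possible(s, total) else 0.0 for s in states])
    return v / v.sum()

def step(states, v_prev, total):
    v_next = np.zeros(len(states))
    for i, s_prev in enumerate(states):
        if v_prev[i] == 0.0:
            continue
        for j, s_next in enumerate(states):
            if within_5pct(s_prev, s_next):
                v_next[j] += v_prev[i]  # propagate belief forward
    for j, s in enumerate(states):
        if not possible(s, total):
            v_next[j] = 0.0             # condition on the new observation
    return v_next / v_next.sum()        # re-normalize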
By the way, things can get way more complex than this if you have noisy observations, very large state spaces, or continuous states. For this reason it's pretty hard to read some of the literature on statistical inference; it's quite general.