In an effort to improve my procedural map generation I've been learning more about how generating noise actually works.
With that in mind I've been doing a Python adaptation of this tutorial series about noise and noise derivatives, and I thought I'd build a node/module system based on ANL and libnoise that might at least be useful to someone else when I'm done.
I've been translating this Javascript version of libnoise into Python, as I've used it before and am familiar with it, and adapting it for 1D and 4D noise (in addition to the 2D and 3D it already supports) and for derivatives.
The derivative rules for adding, subtracting and multiplying used in the original tutorial cover a lot of the module functions, but I've reached the more complicated ones and I'm struggling to figure out how I should be treating the derivatives.
I'm at the Blend module, which takes three different noise inputs and interpolates two of them, with the third acting as the alpha / time value in the lerp function, as follows:
def get1D(self, x):
    # Sample the two noise sources to be blended
    a = self.sourceModules[0].get1D(x)
    b = self.sourceModules[1].get1D(x)
    # Sample the third source, which drives the blend amount
    alpha = self.sourceModules[2].get1D(x)
    return lerp(a, b, alpha)
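(For reference, lerp here is the usual linear interpolation; a minimal sketch of it, assuming the conventional definition, would be:)

def lerp(a, b, t):
    # Standard linear interpolation: returns a at t == 0 and b at t == 1
    return a + t * (b - a)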
And I'm a bit lost as to what to do with the derivatives here. Should I just discard the derivatives of the input noise and calculate a new derivative based on the interpolation? How would this work in higher dimensions, where I have one derivative per axis?
Or do I interpolate the old noise derivatives into a new one?
The original also performs some kind of easing on the alpha noise value. My understanding from working out the noise derivatives is that the easing would definitely need a derivative version, but should the derivative of the alpha noise play any role in the final mix?
I've been trying to understand how automatic differentiation (autodiff) works. There are several implementations of it, found in TensorFlow, PyTorch and other libraries.
There are three aspects of automatic differentiation that currently seem vague to me.
The exact process used to calculate the gradients
How autodiff works with respect to inputs
How autodiff works with respect to a singular value as input
So far, my understanding is that it roughly follows these steps:
Break the original function up into elementary operations (individual arithmetic operations, composition and function calls).
Combine the elementary operations into a computational graph, in such a way that the original function can be evaluated using that graph.
Execute the computational graph for a certain input, recording each operation.
Walk through the recorded operations in reverse, applying the chain rule, to obtain the gradient.
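To make these steps concrete for myself, here is a tiny sketch of what I imagine steps 3 and 4 look like for y = x**2 (just my own illustration, not any library's actual implementation):

# Forward pass: evaluate y = x * x and record the elementary operation on a tape
x = 3.0
tape = []
y = x * x
tape.append(("mul", x, x))           # record the inputs of the multiplication

# Reverse pass: walk the tape backwards, applying the chain rule
grad = {"y": 1.0, "x": 0.0}          # seed with dy/dy = 1
for op, a, b in reversed(tape):
    if op == "mul":                  # d(a*b)/da = b and d(a*b)/db = a
        grad["x"] += grad["y"] * b   # both operands happen to be x here
        grad["x"] += grad["y"] * a

print(grad["x"])                     # 6.0, i.e. dy/dx = 2x at x = 3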
First of all, is this a correct overview of the steps that are taken in automatic differentiation?
Secondly, how would the above process work for a derivative with respect to the inputs? For instance, a derivative would seem to need a difference in the x value. Does that mean the derivative can only be calculated after at least two different x values have been provided as input? Or does it require multiple inputs at once (i.e. a vector input) over which it can calculate a difference? And how does this compare to calculating the gradient with respect to the model weights (i.e. as done in backpropagation)?
Thirdly, how can we take the derivative of a singular value? Take, for instance, the following Python code where the derivative of x**2 is calculated:
import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x)
    y = x**2
# dy = 2x * dx
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())  # prints: 6.0
Since dx is the difference between several x inputs, would that not mean that dx = 0?
I found that this paper gives a pretty good overview of the various modes of autodiff, as well as of the differences compared to numerical and symbolic differentiation. However, it did not bring me a full understanding, and I would still like to understand the autodiff process in the context of these traditional differentiation techniques.
Rather than applying it practically, I would love to get a more theoretical understanding.
I had similar questions in my mind a few weeks ago, until I started to code my own automatic differentiation package tensortrax in Python. It uses forward-mode AD with a hyper-dual number approach. I wrote a README (the landing page of the repository, section Theory) with an example which could be of interest to you.
I think what you need to understand first is what a derivative is; many maths textbooks could help you with that. The notation dx means an infinitesimal variation, so you do not actually compute any difference; rather, you perform a symbolic operation on your function f that transforms it into a function f', also written df/dx, which you can then evaluate at any point where it is defined.
Regarding the algorithm used for automatic differentiation, you understood it correctly. The part you seem to be missing is how the derivatives of elementary operations are computed and what they mean, but it would be hard to do a crash course on that in an SO answer.
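As a minimal illustration of how elementary operations can carry their own derivatives in forward mode, here is a plain-Python sketch of the dual-number idea (my own toy example, not the API of tensortrax or any other library):

class Dual:
    """A value paired with its derivative; each operation applies the matching rule."""
    def __init__(self, value, deriv):
        self.value = value
        self.deriv = deriv

    def __mul__(self, other):
        # Product rule: (f * g)' = f' * g + f * g'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

# Seed x with derivative 1 (dx/dx = 1) and evaluate y = x * x
x = Dual(3.0, 1.0)
y = x * x
print(y.value, y.deriv)  # 9.0 6.0, i.e. dy/dx = 2x = 6 at x = 3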
I am modeling electrical current through various structures with the help of FiPy. To do so, I solve Laplace's equation for the electrical potential. Then, I use Ohm's law to derive the field and with the help of the conductivity, I obtain the current density.
FiPy stores the potential as a cell-centered variable and its gradient as a face-centered variable which makes sense to me. I have two questions concerning face-centered variables:
If I have a two- or three-dimensional problem, FiPy computes the gradient in all directions (ddx, ddy, ddz). The gradient is a FaceVariable, which is always defined on the face between two cell centers. For a structured (quadrilateral) grid, only one of the derivatives should be non-zero, since for any face the positions of the two cell centers involved differ in only one coordinate. In my simulations, however, it frequently happens that more than one of the derivatives (ddx, ddy, ddz) is non-zero, even on a structured grid.
The manual gives the following explanation for the faceGrad method:
Return gradient(phi) as a rank-1 FaceVariable using differencing for the normal direction (second-order gradient).
I do not see how this differs from my understanding pointed out above.
What makes it even more problematic: whenever "too many" derivatives are included, current does not seem to be conserved, even in the simplest structures I model...
Is there a clever way to access the data stored in the face-centered variable? Let's assume I would want to compute the electrical current going through my modeled structure.
As of right now, I save the data stored in the FaceVariable as a tsv-file. This yields a table with (x,y,z)-positions and (ddx, ddy, ddz)-values. I read the file and save the data into arrays to use it in Python. This seems counter-intuitive and really inconvenient. It would be a lot better to be able to access the FaceVariable along certain planes or at certain points.
The documentation does not make it clear, but .faceGrad includes tangential components which account for more than just the neighboring cell center values.
Please see this Jupyter notebook for explicit expressions for the different types of gradients that FiPy can calculate (yes, this stuff should go into the documentation: #560).
The value is accessible with myFaceVar.value and the coordinates with myFaceVar.mesh.faceCenters. FiPy is designed around unstructured meshes and so taking arbitrary slices is not trivial. CellVariable objects support interpolation by calling myCellVar((xs, ys, zs)), but FaceVariable objects do not. See this discussion.
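For example, something along these lines (a sketch assuming a small structured 2D mesh and a cell-centered variable named potential; adapt the mesh and variable to your own setup):

from fipy import Grid2D, CellVariable

# Hypothetical structured mesh and a cell-centered potential
mesh = Grid2D(nx=10, ny=10, dx=1.0, dy=1.0)
potential = CellVariable(mesh=mesh, value=0.0)

faceGrad = potential.faceGrad              # rank-1 FaceVariable
values = faceGrad.value                    # numpy array, shape (dimensions, numberOfFaces)
x, y = faceGrad.mesh.faceCenters.value     # coordinates of the face centers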
This question is regarding curve fitting in python.
First, I should say that I do not know the model function to pass to the curve_fit function in the scipy library; therefore, I am trying to use polyfit, which is OK if I am interested in interpolation, but my goal is to predict values at future points, in other words extrapolation.
I have attached a screenshot of a raw signal, its smoothed version and its polyfit result. The fit has the correct polynomial order but still fails at extrapolation. My conclusion is that polyfit is not the right approach here, but I cannot estimate the curve function. What are your thoughts?
Please note that this is not a distribution, since the y values may keep slowly decreasing indefinitely, even below 0.
I'd say the function looks like an exponential Gaussian, but again it's not a distribution, so I don't want to do that.
My last thought was to split the plot into two parts: the first can certainly be modeled as a polynomial and the second as an exponential. (The values are different from the first screenshot because it's a different signal.)
Then, maybe combine the two. What do you think about this?
Attached is a screenshot of this too.
Since many curves can fit the data and extrapolate differently, you need to choose the right basis functions to get the behaviour you want.
So far you have tried polynomials, for instance; these however tend to ±infinity, which is perhaps not what you want.
I would try and use curve_fit on a sum of Hermite polynomials or Laguerre polynomials. For instance, for Laguerre polynomials, you could try
a + b*exp(-k x) + c*(1-x)*exp(-k x) + d*(x^2 - 4*x + 2)*exp(-k x) + ...
Python has a lot of convenience functions built in for this, see e.g. https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.polynomials.laguerre.html
Note however that you should also fit k to your data, which you could use curve_fit for.
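A rough sketch of that idea (the example data, model terms and starting values below are just placeholders to adapt to your signal):

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical example data; replace with your own signal
xdata = np.linspace(0.0, 10.0, 200)
ydata = (1.0 + 0.3 * xdata) * np.exp(-0.5 * xdata) + 0.01 * np.random.randn(xdata.size)

# First few Laguerre-style terms, each damped by exp(-k*x); k is fitted as well
def model(x, a, b, c, d, k):
    e = np.exp(-k * x)
    return a + b * e + c * (1 - x) * e + d * (x**2 - 4 * x + 2) * e

popt, pcov = curve_fit(model, xdata, ydata, p0=[0.0, 1.0, 0.0, 0.0, 0.5])
print(popt)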
I am trying to implement least squares:
I have: $y = \theta\omega$
The least squares solution is $\omega = (\theta^{T}\theta)^{-1}\theta^{T}y$
I tried:
import numpy as np

def least_squares1(y, tx):
    """Calculate the least squares solution."""
    w = np.dot(np.linalg.inv(np.dot(tx.T, tx)), np.dot(tx.T, y))
    return w
The problem is that this method quickly becomes unstable (for small problems it is okay), as I realized when I compared the result to this least squares calculation:
import numpy as np

def least_squares2(y, tx):
    """Calculate the least squares solution."""
    a = tx.T.dot(tx)
    b = tx.T.dot(y)
    return np.linalg.solve(a, b)
Comparing both methods, I tried to fit data with a polynomial of degree 12, i.e. the basis [1, x, x^2, x^3, ..., x^12].
First method: (plot attached)
Second method: (plot attached)
Do you know why the first method diverges for large polynomials?
P.S. I only added "import numpy as np" for your convenience, in case you want to test the functions.
There are three points here:
One is that it is generally better (faster, more accurate) to solve linear equations rather than to compute inverses.
The second is that it's always a good idea to use what you know about a system of equations (e.g. that the coefficient matrix is positive definite) when computing a solution; in this case you should use numpy.linalg.lstsq.
The third is more specifically about polynomials. When using monomials as a basis, you can end up with a very poorly conditioned coefficient matrix, and this will mean that numerical errors tend to be large. This is because, for example, the vectors x->pow(x,11) and x->pow(x,12) are very nearly parallel. You would get a more accurate fit, and be able to use higher degrees, if you were to use a basis of orthogonal polynomials, for example https://en.wikipedia.org/wiki/Chebyshev_polynomials or https://en.wikipedia.org/wiki/Legendre_polynomials
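A quick sketch of both suggestions (the synthetic data below is just for illustration):

import numpy as np

# Hypothetical data
x = np.linspace(-1, 1, 50)
y = np.sin(3 * x) + 0.01 * np.random.randn(x.size)

# Solve the least-squares problem directly instead of forming (tx^T tx)^-1
tx = np.vander(x, 13, increasing=True)            # monomial basis up to degree 12
w, residuals, rank, sv = np.linalg.lstsq(tx, y, rcond=None)

# For high degrees, fitting in an orthogonal basis (here Chebyshev) is much better conditioned
cheb_coef = np.polynomial.chebyshev.chebfit(x, y, 12)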
I am going to improve on what was said before. I answered this yesterday.
The problem with higher-order polynomials is something called Runge's phenomenon. The reason the previous answer resorted to orthogonal polynomials (such as Hermite polynomials) is that they attempt to get rid of the Gibbs phenomenon, an adverse oscillatory effect that appears when Fourier series methods are applied to non-periodic signals.
You can sometimes improve the conditioning by resorting to regularization methods if the matrix is low rank, as I did in the other post. Other issues may be due to the smoothness properties of the vector.
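For instance, one common regularization is Tikhonov (ridge) regularization, which tames the ill-conditioning by adding a small multiple of the identity to the normal equations (a sketch in the same style as the functions above; the lambda_ default is just a placeholder):

import numpy as np

def least_squares_ridge(y, tx, lambda_=1e-8):
    """Ridge/Tikhonov-regularized least squares: solve (tx^T tx + lambda*I) w = tx^T y."""
    a = tx.T.dot(tx) + lambda_ * np.eye(tx.shape[1])
    b = tx.T.dot(y)
    return np.linalg.solve(a, b)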
I need to do a Fourier transform of a map in Python. Fast Fourier Transforms expect periodic boundary conditions, but the input map is not periodic, so I need to apply an input filter/weight that slowly tapers the map toward zero at the edges. Are there libraries for doing this in Python?
My favorite function to apodize a map is the generalized Gaussian (also called a 'Super-Gaussian'), which is a Gaussian whose exponent is raised to a power P. By setting P to, say, 4 or 6 you get a flat-top pulse which falls off smoothly, which is good for FFT applications where sharp edges always create ripples in conjugate space.
The generalized Gaussian is available in SciPy. Here is a minimal code example (Python 3) to apodize a 2D array with a generalized Gaussian. As noted in previous comments, there are dozens of window functions which would work just as well.
import numpy as np
from scipy.signal.windows import general_gaussian  # window functions live in scipy.signal.windows in newer SciPy releases

# A 128x128 array
array = np.random.rand(128, 128)
# Define a general Gaussian in 2D as the outer product of the 1D window with itself
window = np.outer(general_gaussian(128, 6, 50), general_gaussian(128, 6, 50))
# Multiply to apodize
ap_array = window * array
Such tapering is often referred to as a "window".
Scipy has many window functions.
You can use numpy.expand_dims to create the 2D window you want.
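For example, something like this (a sketch using one of SciPy's window functions; the Tukey window and the sizes are just placeholders):

import numpy as np
from scipy.signal.windows import tukey  # any window from scipy.signal.windows works here

# Build a separable 2D window by broadcasting two copies of a 1D window
win = tukey(128, alpha=0.25)
window_2d = np.expand_dims(win, 0) * np.expand_dims(win, 1)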
Regarding Stefan's comment, apparently the NumPy team considers having included more than just arrays a mistake, so I would stick to using SciPy for signal processing. Watch out, as they moved quite a few functions around in their 1.0 release, so older documentation is, well, quite old.
As a final note: the term "filter" is typically reserved for multiplications applied in the frequency domain, not the spatial domain.