I have written code to plot the average squared error of a linear function over a given dataset, to visualise progress during a gradient descent training for the optimum regression line.
The relevant bits are these:
def compute_error(f, X, Y):
    e = lambda x, y : (y - f(x))**2
    return sum(e(x, y) for (x, y) in zip(X, Y))/len(X)
mn, bn, density = abs(target_slope)*1.5, abs(target_intercept)*1.5, 20
M, B = map(list, zip(*[(m, b) for m in np.linspace(-mn, +mn, density)
                              for b in np.linspace(-bn, +bn, density)]))
E = [compute_error(lambda x : m*x+b, X, Y) for m, b in zip(M,B)]
This works, but is very messy. I suspect there might be a very succinct way to pull off the same thing with numpy. So far I have gotten this:
M, B = map(np.ndarray.flatten, np.mgrid[-mn:+mn:1/density, -bn:+bn:1/density])
I still don't know how to improve the instantiation of E, and for some reason right now it is a lot slower than the messy version.
So, what would be a good way to map over a plane like MXB with numpy?
If you want to run the above code, you can build X and Y like so:
import numpy as np
from numpy.random import normal
target_slope = 3
target_intercept = 15
def generate_random_data(slope=1, minx=0, maxx=100, n=200, intercept=0):
    f = lambda x : normal(slope*x, maxx/5)+intercept
    X = np.linspace(minx, maxx, n)
    Y = [f(x) for x in X]
    return X, Y
X, Y = generate_random_data(slope=target_slope, intercept=target_intercept)
def compute_error(f, X, Y):
    return np.mean( (Y - f(X))**2 )
MB = np.mgrid[-mn:+mn:2*mn/density, -bn:+bn:2*bn/density]
MB = MB.reshape((2, -1)).T
E = [compute_error(lambda x : m*x+b, X, Y) for m, b in MB]
It is possible to write a full numpy solution:
Y = np.array(Y)
M, B = np.mgrid[-mn:+mn:2*mn/density, -bn:+bn:2*bn/density]
mx = M.reshape((-1,1))*X
b = B.reshape((-1,1))*np.ones_like(X)
E = np.mean( (mx+b - Y)**2, axis=1 )
It may also be possible to write a solution that does not need to flatten the arrays and obtains the error directly as a 2D array...
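A minimal sketch of that 2D variant, assuming the slopes and intercepts are given as two 1D arrays (the names `m`, `b`, `pred` and the seeded synthetic data are illustrative, not part of the original code). Broadcasting adds a trailing data axis, and the mean over that axis leaves one error value per (slope, intercept) pair:

```python
import numpy as np

# Synthetic data in the spirit of generate_random_data (assumed values).
rng = np.random.default_rng(0)
X = np.linspace(0, 100, 200)
Y = 3 * X + 15 + rng.normal(0, 20, X.size)

# Candidate slopes and intercepts, one 1D array each.
m = np.linspace(-4.5, 4.5, 20)
b = np.linspace(-22.5, 22.5, 20)

# Shapes: (20, 1, 1) * (200,) + (1, 20, 1) -> (20, 20, 200)
pred = m[:, None, None] * X + b[None, :, None]

# Average squared error over the data axis -> shape (20, 20);
# E[i, j] belongs to slope m[i] and intercept b[j].
E = np.mean((pred - Y) ** 2, axis=-1)
```

The resulting `E` can be passed directly to a surface or contour plot together with `np.meshgrid(m, b, indexing="ij")`.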
I don't fully follow what you're trying to achieve here. However, this may help get you started with a numpy solution:
X, Y = generate_random_data(slope=target_slope, intercept=target_intercept, n=180)
M, B = np.mgrid[-mn:+mn:1/density, -bn:+bn:1/density]
f = M.T*X + B.T
error = np.sum((f-Y)**2)
Note that I've had to alter the default number of X, Y values (n=180) so that the arrays broadcast against the grid.
I have N datapoints in 3D that lie on a line. The y-direction is fixed, so I want to fit x and z against y.
Let's say we have 6 datapoints that align with the y axis:
x=[0,0,0,0,0,0]
y=[1,2,3,4,5,6]
z=[0,0,0,0,0,0]
What I want to do:
I want to get the best set of fitting parameters, the goodness of fit (gof) and the fitting error.
So far, with a least-squares fit, I get a reduced chi2 of < 1, which means I might be overfitting (or misunderstanding something).
Questions:
1.) For the above example I receive a reduced chi2 of 0. This seems wrong to me?
2.) I am also wondering whether a least-squares fit is adequate for this. Maybe someone can shed some light on this? Would SVD be a better choice?
import scipy.optimize
import numpy as np
# define a model (line)
def linear(params, y):
    a, b = params
    data = [a * y[i] + b for i in range(0, len(y))]
    return data
# define the residuals that need to be minimized
def fitting_cost(params, x, y, z):
    a_x, b_x, a_z, b_z = params
    x_pred = linear((a_x, b_x), y)
    z_pred = linear((a_z, b_z), y)
    # use the actual number of points instead of a hard-coded 6
    res_x = [x_pred[i] - x[i] for i in range(0, len(x))]
    res_z = [z_pred[i] - z[i] for i in range(0, len(z))]
    return res_x + res_z
# do the fit and return parameters plus gof
def least_squares_fit(x, y, z):
    sp = [0,0,0,0]
    result = scipy.optimize.leastsq(fitting_cost, sp,
                                    args=(x, y, z),
                                    full_output=True)
    s_sq = (result[2]['fvec'] ** 2).sum() / (
        len(result[2]['fvec']) - len(result[0]))
    return result[0], s_sq
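As a cross-check, here is a hedged sketch of the same fit with `numpy.linalg.lstsq`, which solves the linear problem through an SVD-based routine. The variable names (`A`, `targets`, `coef`, `dof`) are illustrative. For the example data every point lies exactly on the line, so all residuals vanish and the residual variance `s_sq` comes out as 0. Note also that `s_sq` is not weighted by any measurement uncertainties, so it is a residual variance rather than a reduced chi-square in the strict sense.

```python
import numpy as np

x = np.zeros(6)
y = np.arange(1.0, 7.0)
z = np.zeros(6)

# Design matrix for the model  value = a * y + b
A = np.column_stack([y, np.ones_like(y)])

# Fit x(y) and z(y) in one call: each column of `targets` is one fit.
targets = np.column_stack([x, z])
coef, _, rank, _ = np.linalg.lstsq(A, targets, rcond=None)

# Same statistic as in least_squares_fit: sum of squared residuals
# divided by (number of residuals - number of fitted parameters).
residuals = A @ coef - targets
dof = residuals.size - coef.size      # 12 residuals - 4 parameters
s_sq = (residuals ** 2).sum() / dof
```

Here `coef[:, 0]` holds `(a_x, b_x)` and `coef[:, 1]` holds `(a_z, b_z)`.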
I am trying to implement Complex Exponential Fourier Series for f(x) defined on [-L,L] using these formulas,
I want to be able to implement these without calling the Fourier functions in other libraries since I want to also understand what's going on. Here is my attempt,
import numpy as np
from matplotlib import pyplot as plt
steps = 100
dt = 1/steps
L = np.pi
t = np.linspace(-L, L, steps)
def constant(X, Y, n):
    return (1/(2*L))*sum([y*np.exp((1j*n*np.pi*t)/L)*dt for t, y in zip(X, Y)])
def complex_fourier(X, Y, N):
    _X, _Y = [], []
    for t in X:
        f = 0
        for n in range(-N//2, N//2 + 1):
            c = constant(X, Y, n)
            f += c*np.exp((-1j*n*np.pi*t)/L)
        _X += [f.real]
        _Y += [f.imag]
    return _X, _Y
X, Y = complex_fourier(t, np.sin(t), 50)
plt.plot(X, Y, 'k.')
# plt.plot(t, np.sin(t))
plt.show()
The plot seems to be almost random and does not improve with more c terms. Could someone point out exactly what am I doing wrong?
Just to answer the "random plot" part of the question for now - note the Y-scale of your plot!
>>> np.min(Y), np.max(Y)
(-6.1937063114043705e-18, 6.43002899658067e-18)
>>> np.min(X), np.max(X)
(-0.15754356079010426, 0.15754356079010395)
In other words, all of your values are basically real-valued. You probably wouldn't be interested in a plot of the imaginary part vs the real part, but rather the sum of squares vs the frequency or mode number.
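Beyond the plotting issue, a possible further contributor (an observation of mine, not part of the answer above) is that `dt = 1/steps` does not match the spacing of a grid that spans `2*L`. That would scale everything by roughly 1/(2π), which is consistent with the maximum of about 0.157 shown above. The sketch below uses `dt = 2*L/steps` on a grid without the duplicate endpoint. The function names `coefficient` and `partial_sum` are hypothetical, and the sign convention (minus in the coefficient, plus in the sum) is the common one, which may differ from the formulas referenced in the question.

```python
import numpy as np

L = np.pi
steps = 200
# Periodic grid on [-L, L): the endpoint is dropped so it is not counted twice.
t = np.linspace(-L, L, steps, endpoint=False)
dt = 2 * L / steps
y = np.sin(t)

def coefficient(n):
    # Riemann-sum approximation of (1/2L) * integral f(t) exp(-i n pi t / L) dt
    return np.sum(y * np.exp(-1j * n * np.pi * t / L)) * dt / (2 * L)

def partial_sum(N):
    # Sum of c_n * exp(+i n pi t / L) for n = -N..N, evaluated on the grid t
    ns = np.arange(-N, N + 1)
    c = np.array([coefficient(n) for n in ns])
    basis = np.exp(1j * np.pi * ns[:, None] * t[None, :] / L)
    return (c[:, None] * basis).sum(axis=0)

approx = partial_sum(5)
```

Plotting `approx.real` against `t` (rather than the real part against the imaginary part) then shows how well the partial sum matches `np.sin(t)`.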
I want to calculate the divergence of a given vector with sympy. Is there a Python function for this? I looked through the functions of einsteinpy, but I haven't found any that help.
Basically I want to calculate \nabla_\mu (n v^\mu)=0 from a given vector v; n being a constant number.
\nabla_\mu (nv^\mu)=0 represents a divergence, where \mu runs over x, y and z and each term is the derivative of the corresponding vector component with respect to that coordinate. For example:
\nabla_\mu (n v^\mu) = \partial_x (u^x) + \partial_y(u^y) + \partial_z(u^z)
u can be something like (2x,4y,6z)
I appreciate any help.
As shown by @mikuszefski, you can use the sympy.vector module, which provides an implementation of the divergence.
Another way to do what you want is to use the function derive_by_array to get a tensor and then do an Einstein contraction.
import sympy as sp
x, y, z = sp.symbols("x y z") # dim = 3
# Now the functions that you want:
u, v, w = 2*x, 4*y, 6*z
# In a more general way, you can do:
u = sp.Function("u")(x, y, z)
v = sp.Function("v")(x, y, z)
w = sp.Function("w")(x, y, z)
U = sp.Array([u, v, w]) # U is a vector of dim = 3 (or sympy.Array)
X = sp.Array([x, y, z]) # X is a vector of dim = 3 (or sympy.Array)
dUdX = sp.derive_by_array(U, X) # dUdX is a tensor of dim = 3 and order = 2
# First way:
divU = sp.trace(sp.Matrix(sp.derive_by_array(U, X))) # Limited
# Second way:
divU = sp.tensorcontraction(sp.derive_by_array(U, X), (0, 1)) # More general
This solution is not tied to three dimensions (it works fine when dim = 2, for example), but you must have len(X) == len(U).
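For the concrete field (2x, 4y, 6z) from the question, both routes can be compared in a short sketch. The coordinate-system name `C` and the variable names `div_vector` and `div_tensor` are illustrative choices:

```python
import sympy as sp
from sympy.vector import CoordSys3D, divergence

# Route 1: sympy.vector with a Cartesian coordinate system
C = CoordSys3D("C")
field = 2*C.x*C.i + 4*C.y*C.j + 6*C.z*C.k
div_vector = divergence(field)

# Route 2: derive_by_array followed by a contraction over both indices
x, y, z = sp.symbols("x y z")
U = sp.Array([2*x, 4*y, 6*z])
X = sp.Array([x, y, z])
div_tensor = sp.tensorcontraction(sp.derive_by_array(U, X), (0, 1))
```

Both results equal 2 + 4 + 6 = 12, as expected from differentiating each component with respect to its own coordinate.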
I have 3 arrays of data to integrate over and need to return a 3d array. This is my code
import numpy as np
from scipy import integrate
def f(x, y, z):
    return x + y + 2*z
a = np.arange(64)
b = np.arange(100)
c = np.arange(100)
result = []
for x in a:
    for y in b:
        for z in c:
            result.append(integrate.quad(f, 0, x, (y, z))[0])
result_1 = np.array(result).reshape(len(a), len(b), len(c))
print(result_1)
It works, but this code is very slow and I need to handle bigger problems than this. Is there any other method, something like broadcasting? The following is the function I actually need to handle:
import numpy as np
from scipy import integrate, special
def f(v,r):
    alpha = 0.5
    chi = 1/2*special.erf((r+1.2)/0.3)-1/2*special.erf((r-1.2)/0.3)
    return 4/(np.sqrt(2*np.pi*alpha))*chi*np.exp(-v**2/(2*alpha))
def E_1(r):
    def f_1(v,r):
        return r*f(v,r)
    a = 0
    b = r
    g = lambda x: -np.inf
    h = lambda x: np.inf
    return integrate.dblquad(f_1, a, b, g, h)
def E_f(tau, xi_1, xi_2):
    return E_1(xi_1*np.cos(tau) + xi_2*np.sin(tau))[0]*(-np.sin(tau))
I need to pass tau, xi_1 and xi_2 as three arrays and get back a 3D array, arranged like a coordinate grid so that every coordinate point corresponds to one result, just like in the first example.
I'm new to Python so please be patient. I appreciate any help!
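For the first toy example only, the triple loop can be replaced by broadcasting, because the integral of `x + y + 2*z` over the first argument has a closed form. This is a sketch under that assumption and does not carry over directly to the second function. The names `A`, `B`, `C`, `result_fast` and `check`, and the small array sizes, are illustrative.

```python
import numpy as np
from scipy import integrate

def f(x, y, z):
    return x + y + 2*z

a = np.arange(4)
b = np.arange(5)
c = np.arange(6)

# Give each array its own axis so the result has shape (len(a), len(b), len(c)).
A = a[:, None, None]
B = b[None, :, None]
C = c[None, None, :]

# Closed form: integral from 0 to A of (t + B + 2C) dt = A**2/2 + (B + 2C)*A
result_fast = A**2 / 2 + (B + 2*C) * A

# Spot check of one entry against the numerical integration used in the loop
check = integrate.quad(f, 0, a[3], args=(b[2], c[4]))[0]
```

When no closed form is available, the same axis layout can still be used to organise the output, but each integral then has to be evaluated numerically.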
What I have: three 1D lists (xr, yr, zr), one containing x-values, the other two y- and z-values
What I want to do: create a 3D contour plot in matplotlib
I realized that I need to convert the three 1D lists into three 2D lists, by using the meshgrid function.
Here's what I have so far:
xr = np.asarray(xr)
yr = np.asarray(yr)
zr = np.asarray(zr)
X, Y = np.meshgrid(xr,yr)
znew = np.array([zr for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = znew.reshape(X.shape)
Running this gives me the following error (for the last line I entered above):
total size of new array must be unchanged
I went digging around stackoverflow, and tried using suggestions from people having similar problems. Here are the errors I get from each of those suggestions:
Changing the last line to:
Z = znew.reshape(X.shape[0])
Gives the same error.
Changing the last line to:
Z = znew.reshape(X.shape[0], len(znew))
Gives the error:
Shape of x does not match that of z: found (294, 294) instead of (294, 86436).
Changing it to:
Z = znew.reshape(X.shape, len(znew))
Gives the error:
an integer is required
Any ideas?
Well, the sample code below works for me:
import numpy as np
import matplotlib.pyplot as plt
xr = np.linspace(-20, 20, 100)
yr = np.linspace(-25, 25, 110)
X, Y = np.meshgrid(xr, yr)
#Z = 4*X**2 + Y**2
zr = []
for i in range(0, 110):
    y = -25.0 + (50./110.)*float(i)
    for k in range(0, 100):
        x = -20.0 + (40./100.)*float(k)
        v = 4.0*x*x + y*y
        zr.append(v)
Z = np.reshape(zr, X.shape)
print(X.shape)
print(Y.shape)
print(Z.shape)
plt.contour(X, Y, Z)
plt.show()
TL;DR
import matplotlib.pyplot as plt
import numpy as np
def get_data_for_mpl(X, Y, Z):
    result_x = np.unique(X)
    result_y = np.unique(Y)
    result_z = np.zeros((len(result_x), len(result_y)))
    # result_z[:] = np.nan
    for x, y, z in zip(X, Y, Z):
        i = np.searchsorted(result_x, x)
        j = np.searchsorted(result_y, y)
        result_z[i, j] = z
    # contour/contourf expect Z with shape (len(y), len(x)), hence the transpose
    return result_x, result_y, result_z.T
xr, yr, zr = np.genfromtxt('data.txt', unpack=True)
plt.contourf(*get_data_for_mpl(xr, yr, zr), 100)
plt.show()
Detailed answer
At the beginning, you need to find out for which values of x and y the graph is being plotted. This can be done using the numpy.unique function:
result_x = numpy.unique(X)
result_y = numpy.unique(Y)
Next, you need to create a numpy.ndarray with function values for each point (x, y) from zip(X, Y):
result_z = numpy.zeros((len(result_x), len(result_y)))
for x, y, z in zip(X, Y, Z):
    i = search(result_x, x)
    j = search(result_y, y)
    result_z[i, j] = z
If the array is sorted, the search can be performed in logarithmic rather than linear time, so it is enough to use the numpy.searchsorted function. To use it, the arrays result_x and result_y must be sorted. Fortunately, numpy.unique returns its result sorted, so no additional steps are needed. It is enough to replace search (which is not implemented anywhere and is given simply as an intermediate step) with np.searchsorted.
Finally, to get the desired image, it is enough to call the matplotlib.pyplot.contour or matplotlib.pyplot.contourf method.
If a function value does not exist for every pair (x, y) with x from result_x and y from result_y, and you simply want nothing drawn there, it is enough to replace the missing values with NaN. Or, more simply, create result_z as a numpy.ndarray filled with NaN and then fill it in:
result_z = numpy.zeros((len(result_x), len(result_y)))
result_z[:] = numpy.nan
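A loop-free variant of the same idea, offered as a sketch rather than as part of the answer above: `numpy.unique` with `return_inverse=True` already returns the grid index of every input point, so the NaN-filled array can be populated with one indexed assignment. The function name `scattered_to_grid` and the synthetic data are hypothetical. The grid is laid out with rows along y and columns along x, which is the shape matplotlib's `contour` and `contourf` expect for Z when X and Y are 1D.

```python
import numpy as np

def scattered_to_grid(x, y, z):
    """Arrange flat (x, y, z) samples on the grid spanned by unique x and y."""
    x, y, z = map(np.ravel, (x, y, z))
    xs, ix = np.unique(x, return_inverse=True)   # ix[k]: column of sample k
    ys, iy = np.unique(y, return_inverse=True)   # iy[k]: row of sample k
    grid = np.full((len(ys), len(xs)), np.nan)   # missing points stay NaN
    grid[iy, ix] = z
    return xs, ys, grid

# Synthetic samples of z = 4x^2 + y^2 on a 4 x 5 grid, given in shuffled order
gx, gy = np.meshgrid(np.linspace(-2, 2, 5), np.linspace(-1, 1, 4))
gz = 4 * gx**2 + gy**2
order = np.random.default_rng(0).permutation(gx.size)
xs, ys, grid = scattered_to_grid(gx.ravel()[order],
                                 gy.ravel()[order],
                                 gz.ravel()[order])
```

The result can be drawn with `plt.contourf(xs, ys, grid)`.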