I have been using MATLAB for a few years, and I find it quite easy and powerful when it comes to 3D plots such as surf, contour or contourf. Doing the same in Python seems much less intuitive to me.
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(0, 100, 0.1)   # time domain
sp = np.arange(0, 50, 0.2)   # spatial domain
c = 0.5
u0 = np.exp(-(sp - 5)**2)    # initial profile

u = np.empty((len(t), len(sp)))   # one row of u per time step
for i in range(len(t)):
    u[i, :] = u0 * (sp - c * t[i])

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(t, sp, u)   # fails: t and sp are 1D arrays of different lengths
plt.show()
So, in MATLAB it would be that easy, I think.
What do I have to do to get a 3D plot (surface or whatever) from two arrays of different sizes for the x and y dimensions and a z-matrix assigning a value to each grid point?
As this is a basic question, feel free to explain a bit more, or just point me to a link with an answer. Unfortunately, I do not really understand what is happening in the code samples I have read about this problem so far.
I don't think what you have written would work in MATLAB either (I may be wrong; I haven't used it in a while).
For plot_surface(X, Y, Z), the arguments X, Y and Z must be 2D arrays of the same shape. So, just as you would in MATLAB, build the grids with meshgrid:

T, SP = np.meshgrid(t, sp, indexing='ij')  # indexing='ij' makes the shapes match u: (len(t), len(sp))
ax.plot_surface(T, SP, u)
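Putting that together, here is a minimal, runnable sketch of the whole example (keeping the wave expression from the question as-is):

import numpy as np
import matplotlib.pyplot as plt

t = np.arange(0, 100, 0.1)           # time domain
sp = np.arange(0, 50, 0.2)           # spatial domain
c = 0.5
u0 = np.exp(-(sp - 5)**2)

# build u row by row, one row per time step: shape (len(t), len(sp))
u = np.array([u0 * (sp - c * ti) for ti in t])

# 2D grids with the same shape as u
T, SP = np.meshgrid(t, sp, indexing='ij')

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(T, SP, u)
ax.set_xlabel('t')
ax.set_ylabel('x')
plt.show()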
I have implemented this heart curve (Wolfram MathWorld) via an implicit-function script I found here: Plot implicit equations in Python 3.
import matplotlib.pyplot as plt
import numpy as np

delta = 0.025
xrange = np.arange(-2, 2, delta)
yrange = np.arange(-2, 2, delta)
X, Y = np.meshgrid(xrange, yrange)

# F is one side of the equation, G is the other
F = X**2
G = 1 - (Y - np.abs(X)**(2/3))**2

plt.contour(F - G, [0])
plt.show()
How would I extract the resulting array of x,y pairs defining this curve?
One could say this is "cheating" a bit, but a rough, concise recipe is to take the mesh points that come close to satisfying the condition and read off their coordinates:

# subset points that are very close to satisfying the exact condition
# and get their indices; note that rows of F - G index y, columns index x
y_id, x_id = np.nonzero(abs(F - G) < 8e-3)

# get the coordinates corresponding to the indices
plt.scatter(xrange[x_id], yrange[y_id])

xrange[x_id] and yrange[y_id] are the desired arrays of coordinates.
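Alternatively, since Matplotlib already traces the zero level set, the curve's vertices can be read straight out of the returned ContourSet. A short sketch (allsegs is a documented ContourSet attribute; the vertices come back in data coordinates when X and Y are passed to contour):

import numpy as np
import matplotlib.pyplot as plt

delta = 0.025
xrange = np.arange(-2, 2, delta)
yrange = np.arange(-2, 2, delta)
X, Y = np.meshgrid(xrange, yrange)
F = X**2
G = 1 - (Y - np.abs(X)**(2/3))**2

# pass X and Y so the contour vertices are in data coordinates
cs = plt.contour(X, Y, F - G, [0])

# allsegs[level][segment] is an (N, 2) array of (x, y) vertices
segments = cs.allsegs[0]
xy = np.vstack(segments)   # all pieces of the zero-level curve
print(xy.shape)            # (N, 2): the x, y pairs defining the curve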
I'm not really a matplotlib user and there may be much, much better ways of doing what you ask - so please wait and see if someone comes up with a proper answer...
However, for the moment, you appear to be plotting the array F-G, so I saved that as a TIF (because it can store floats) using PIL like this:
from PIL import Image
...
your code
...
contour = (F-G).astype(np.float32)
Image.fromarray(contour).save('contour.tif')
That gives this:
Inspection of contour shows that the range is -1 to 16:
print(contour.min(), contour.max())
-1.0 15.869446307662544
So, I deduce that your points are probably the ones near zero:
points = (contour>-0.1) & (contour<0.1)
So, I can get the list of X,Y with:
Y, X = np.where(points)
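If data coordinates rather than array indices are wanted, the indices can be mapped back through the original axis arrays. A small sketch continuing from the code above:

# np.where on a 2D array returns (row, column) indices,
# i.e. rows index y and columns index x
Y, X = np.where(points)

# map indices back to data coordinates via the original axis arrays
x_coords = xrange[X]
y_coords = yrange[Y]

plt.scatter(x_coords, y_coords, s=1)
plt.show()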
I tried to generate 1000 points in 2D, uniformly distributed on the rectangle [-1,1] x [0,0.5], and then plot them, but I couldn't. I get this error:
TypeError: 'float' object cannot be interpreted as an integer.
Here is the code I came up with:
import numpy as np
import matplotlib.pyplot as plt

vect = np.random.uniform(1000)
plt.plot(range(-1, 1), range(0, 0.5), vect)  # range() rejects floats, hence the TypeError
plt.show()
I think I don't really understand how to do it. Should I use np.random.randn or np.random.rand? I would appreciate an explanation of what I did wrong (if possible line by line, so that I can understand it better).
Thank you
First: xlist = np.random.uniform(low=-1, high=1, size=1000)
Second: ylist = np.random.uniform(low=0, high=0.5, size=1000)
Last: plt.scatter(xlist, ylist)
(Note that in np.random.uniform the number of samples is the third argument, size, so it cannot be passed first as in your attempt.)
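As a complete, runnable sketch (np.random.rand and np.random.randn are not needed here; uniform already draws from the requested interval):

import numpy as np
import matplotlib.pyplot as plt

# 1000 uniform draws per axis, covering [-1, 1] x [0, 0.5]
xlist = np.random.uniform(low=-1, high=1, size=1000)
ylist = np.random.uniform(low=0, high=0.5, size=1000)

plt.scatter(xlist, ylist, s=5)
plt.xlabel('x')
plt.ylabel('y')
plt.show()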
My problem is allegedly simple: I have scatter data in X and Y, and want a nice, well-fitting trendline with a known equation so that I can go on to convert LDR voltages into power readings. However, I'm having trouble generating a trendline in Matplotlib or SciPy that fits well, which I believe is because there's a logarithmic relationship.
I'm using Spyder and Matplotlib, and first tried plotting the X (Thorlabs) and Y (LDR) data as a log-log scatter plot. Because the data didn't seem to show a linear relationship after doing this, I then used NumPy's Polynomial.fit with degree 5 to 6. This looked good, but when I inverted the axes to get something of the form [LDR] = f[Thorlabs], I noticed the fit was suddenly not very good at all at the extremes of my data.
This question suggests that curve_fit is the way to go, but I tried using curve_fit as described here and, after adjusting to increase the maximum number of curve-fit iterations, stumbled on the error "TypeError: can't multiply sequence by non-int of type 'numpy.float64'", which I assumed was because my data contain decimal points. I'm not sure how to account for this.
I have several mini-questions, then -
am I misunderstanding the above examples?
is there a better way I could go about trying to find the ideal trendline for this data? Is it possible that it's some sort of logarithmic relationship on top of a log-log plot?
once I get a trendline, how can I make sure it fits well and can be displayed?
#import libraries
import matplotlib.pyplot as plt
import csv
import numpy as np
from numpy.polynomial import Polynomial
import scipy.optimize as opt

#initialise arrays - I create log arrays too so I can plot directly
deg = 6 #degree of polynomial fitting for Polynomial.fit()
thorlabs = []
logthorlabs = []
ldr = []
logldr = []

#read in LDR/Thorlabs datasets from file
with open('16ldr561nm.txt','r') as csvfile:
    plots = csv.reader(csvfile, delimiter='\t')
    for row in plots:
        thorlabs.append(float(row[0]))
        ldr.append(float(row[1]))
        logthorlabs.append(np.log(float(row[0])))
        logldr.append(np.log(float(row[1])))
#This seems to work just fine; I now have lists containing the data as floats

#fit and plot log polynomials
p = Polynomial.fit(logthorlabs, logldr, deg)
plt.plot(*p.linspace()) #plot lines

#plot scatter graphs on log-log axis - either using log arrays or on loglog plot
#plt.loglog()
plt.scatter(logthorlabs, logldr, label='16bit ADC LDR1')
plt.xlabel('log Thorlabs laser power (microW)')
plt.ylabel('log LDR voltage (mV)')
plt.title('LDR voltage against laser power at 561nm')
plt.legend()
plt.show()

#attempt at using curve_fit - when using, comment out the above block
"""
# This is the function we are trying to fit to the data.
def func(x, a, b, c):
    return a * np.exp(-b * x) + c

#freaks out here as I get a type error which I am not sure how to account for
# Plot the actual data
plt.plot(thorlabs, ldr, ".", label="Data")

#Adjusted maxfev to 5000. I know you can make "guesses" here but I am not sure how to do so
# The actual curve fitting happens here
optimizedParameters, pcov = opt.curve_fit(func, thorlabs, ldr, maxfev=5000)

# Use the optimized parameters to plot the best fit
plt.plot(thorlabs, func(thorlabs, *optimizedParameters), label="fit")

# Show the graph
plt.legend()
plt.show()
"""
When using curve_fit, I get a "TypeError: can't multiply sequence by non-int of type 'numpy.float64'".
As I don't have enough reputation to post images, my raw dataset can be found here. (Otherwise I'd include the graphs!)
(Note that I actually have two datasets, but as I only want to know the principle for calculating a trendline for one, I've left out the other dataset above.)
Refactoring your code a bit, most importantly converting to native NumPy arrays once things have been parsed out of the file, makes it not crash, but the curve_fit line doesn't look good at all.
The code prints out the parameters fitted by curve_fit, which don't look very good either, along with a warning: "Covariance of the parameters could not be estimated". I'm no mathematician/statistician, so I don't know what to do about that.
from numpy.polynomial import Polynomial
import csv
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize as opt

def read_dataset(filename):
    x = []
    y = []
    with open(filename, "r") as csvfile:
        plots = csv.reader(csvfile, delimiter="\t")
        for row in plots:
            x.append(float(row[0]))
            y.append(float(row[1]))
    # cast to native numpy arrays
    x = np.array(x)
    y = np.array(y)
    return (x, y)

# first column of the file is the Thorlabs power, second is the LDR voltage
thorlabs, ldr = read_dataset("16ldr561nm.txt")

plt.scatter(thorlabs, ldr, label="Data")
plt.xlabel("Thorlabs laser power (microW)")
plt.ylabel("LDR voltage (mV)")
plt.title("LDR voltage against laser power at 561nm")

# Generate and plot polynomial
p = Polynomial.fit(thorlabs, ldr, 6)
plt.plot(*p.linspace(), label="Polynomial")

# Generate and plot curvefit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c

optimizedParameters, pcov = opt.curve_fit(func, thorlabs, ldr)
print(optimizedParameters, pcov)
plt.plot(thorlabs, func(thorlabs, *optimizedParameters), label="CurveFit")

# Show everything
plt.legend()
plt.show()
If you really need to log() the data, it's easily done with
x = np.log(x)
y = np.log(y)
which will keep the arrays as NumPy arrays and be plenty faster than doing it "by hand".
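Building on that, here is a sketch of one way to get a trendline with a known, invertible equation: if the log-log scatter is roughly a straight line, a power law y = a*x**b can be fitted as a line in log space. This assumes a power-law relationship, which I have not verified against the actual dataset; it continues from the thorlabs and ldr arrays in the script above:

# fit log(y) = b*log(x) + log(a), i.e. the power law y = a * x**b
logx = np.log(thorlabs)
logy = np.log(ldr)
b, loga = np.polyfit(logx, logy, 1)   # slope and intercept in log space
a = np.exp(loga)

xs = np.linspace(thorlabs.min(), thorlabs.max(), 200)
plt.scatter(thorlabs, ldr, label="Data")
plt.plot(xs, a * xs**b, label=f"y = {a:.3g} * x^{b:.3g}")
plt.legend()
plt.show()

# a power law inverts cleanly: x = (y / a)**(1 / b)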
I have a big data file containing 3 columns for x, y, z (a 1.6M x 3 array, 128 MB). I was trying to produce a contourf plot in Matplotlib, but when transforming the data onto an X, Y grid I ran out of memory, which was of course very likely to happen.
Here is an example script (I have also tried other numpy/scipy methods for the interpolation, but got the same MemoryError):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.mlab import griddata  # presumably the old mlab griddata (removed in modern Matplotlib)

a = np.loadtxt("data.dat")
xi = np.linspace(min(a[:,0]), max(a[:,0]), 40000)
yi = np.linspace(min(a[:,1]), max(a[:,1]), 40000)
X, Y = np.meshgrid(xi, yi)
Z = griddata(a[:,0], a[:,1], a[:,2], xi, yi, interp='nearest')
plt.contourf(X, Y, Z)
plt.show()
So I tried the same procedure in MATLAB, but in the end I would have had to use a very small matrix for the interpolation, thus losing the picture I was interested in (the data show quite fast oscillations). Here is the MATLAB script (the variable "a" again holds the data array):
x1 = a(:,1);
y1 = a(:,2);
z1 = a(:,3);
xi = linspace(min(x1), max(x1), 40000);
yi = linspace(min(y1), max(y1), 40000);
[XI, YI] = meshgrid(xi, yi);
ZI = griddata(x1, y1, z1, XI, YI, 'nearest');
contourf(XI, YI, ZI)
Obviously, it looks the same as in Python :)
Just to show that it is feasible to get what I want, a result from Origin is here
None of the solutions which I found on the web worked for my problem.
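For scale: a 40000 x 40000 grid of float64 values needs 40000^2 x 8 bytes, roughly 12.8 GB per array, so the MemoryError is expected. One memory-light possibility, sketched below, is to skip the gridding entirely and let Matplotlib triangulate the scattered points itself with tricontourf; this assumes the 1.6M points themselves fit comfortably in memory, which at 128 MB they should:

import numpy as np
import matplotlib.pyplot as plt

a = np.loadtxt("data.dat")
x, y, z = a[:, 0], a[:, 1], a[:, 2]

# contour directly on the scattered points: no 40000 x 40000 grid needed
plt.tricontourf(x, y, z, levels=64)
plt.colorbar()
plt.show()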
I've been having invalid-input errors when working with the SciPy interp2d function. It turns out the problem comes from the bisplrep function, as shown here:
import numpy as np
from scipy import interpolate
# Case 1
x = np.linspace(0,1)
y = np.zeros_like(x)
z = np.ones_like(x)
tck = interpolate.bisplrep(x,y,z) # or interp2d
Returns: ValueError: Invalid inputs
It turned out that the test data I was giving interp2d contained only one distinct value along the 2nd axis, as in the test sample above. The bisplrep function inside interp2d considers this an invalid input.
This may be considered as an acceptable behaviour: interp2d & bisplrep expect a 2D grid, and I'm only giving them values along one line.
On a side note, I find the error message quite unclear. One could include a test in interp2d to deal with such cases: something along the lines of

if len(np.unique(x)) == 1 or len(np.unique(y)) == 1:
    raise ValueError("Can't build 2D splines if x or y values are all the same")

may be enough to detect this kind of invalid input and raise a more explicit error message, or even to directly call the more appropriate interp1d function (which works perfectly here).
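To illustrate that last remark, here is a sketch of handing the degenerate Case 1 data to interp1d instead (the fallback is my suggestion, not something interp2d actually does):

import numpy as np
from scipy import interpolate

x = np.linspace(0, 1)
y = np.zeros_like(x)   # degenerate: only one distinct y value
z = np.ones_like(x)

# the data are really one-dimensional (z as a function of x),
# so a 1D interpolant is the appropriate tool
f = interpolate.interp1d(x, z)
print(f(0.5))   # 1.0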
I thought I had correctly understood the problem. However, consider the following code sample:
# Case 2
x = np.linspace(0,1)
y = x
z = np.ones_like(x)
tck = interpolate.bisplrep(x,y,z)
In that case, y being proportional to x, I'm again feeding bisplrep data along a single line. But, surprisingly, bisplrep is able to compute a 2D spline interpolation in that case. I plotted it:
# Plot
def plot_0to1(tck):
    import matplotlib.pyplot as plt
    from matplotlib import cm
    from mpl_toolkits.mplot3d import Axes3D

    X = np.linspace(0, 1, 10)
    Y = np.linspace(0, 1, 10)
    Z = interpolate.bisplev(X, Y, tck)
    X, Y = np.meshgrid(X, Y)

    fig = plt.figure()
    ax = Axes3D(fig)
    ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm,
                    linewidth=0, antialiased=False)
    plt.show()

plot_0to1(tck)
The result is the following:
where bisplrep seems to fill the gaps with 0s, as better shown when I extend the plot below:
Regardless of whether filling with 0s is expected, my real question is: why does bisplrep work in Case 2 but not in Case 1?
Or, in other words: should 2D interpolation raise an error when it is fed input along one direction only (in which case Cases 1 and 2 should both fail), or not (in which case Cases 1 and 2 should both return something, even if unreliable)?
I was originally going to show you how much of a difference it makes for 2d interpolation if your input data are oriented along the coordinate axes rather than in some general direction, but it turns out that the result would be even messier than I had anticipated. I tried using a random dataset over an interpolated rectangular mesh, and comparing that to a case where the same x and y coordinates were rotated by 45 degrees for interpolation. The result was abysmal.
I then tried doing a comparison with a smoother dataset: turns out scipy.interpolate.interp2d has quite a few issues. So my bottom line will be "use scipy.interpolate.griddata".
For instructive purposes, here's my (quite messy) code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm as cm
import scipy.interpolate as interp

n = 10                        # rough number of points
dom = np.linspace(-2,2,n+1)   # 1d input grid
x1,y1 = np.meshgrid(dom,dom)  # 2d input grid
z = np.random.rand(*x1.shape) # ill-conditioned sample
#z = np.cos(x1)*np.sin(y1)    # smooth sample

# first interpolator with interp2d:
fun1 = interp.interp2d(x1,y1,z,kind='linear')

# construct twice finer plotting and interpolating mesh
plotdom = np.linspace(-1,1,2*n+1) # for interpolation and plotting
plotx1,ploty1 = np.meshgrid(plotdom,plotdom)
plotz1 = fun1(plotdom,plotdom)    # interpolated points

# construct 45-degree rotated input and interpolating meshes
rotmat = np.array([[1,-1],[1,1]])/np.sqrt(2) # 45-degree rotation
x2,y2 = rotmat.dot(np.vstack([x1.ravel(),y1.ravel()])) # rotate input mesh
plotx2,ploty2 = rotmat.dot(np.vstack([plotx1.ravel(),ploty1.ravel()])) # rotate plotting/interp mesh

# interpolate on rotated mesh with interp2d
# (reverse rotate by using plotx1, ploty1 later!)
fun2 = interp.interp2d(x2,y2,z.ravel(),kind='linear')

# I had to generate the rotated points element-by-element
# since fun2() accepts only rectangular meshes as input
plotz2 = np.array([fun2(xx,yy) for (xx,yy) in zip(plotx2.ravel(),ploty2.ravel())])

# try interpolating with griddata
plotz3 = interp.griddata(np.array([x1.ravel(),y1.ravel()]).T, z.ravel(),
                         np.array([plotx1.ravel(),ploty1.ravel()]).T, method='linear')
plotz4 = interp.griddata(np.array([x2,y2]).T, z.ravel(),
                         np.array([plotx2,ploty2]).T, method='linear')

# function to plot a surface
def myplot(X,Y,Z):
    fig = plt.figure()
    ax = Axes3D(fig)
    ax.plot_surface(X, Y, Z, rstride=1, cstride=1,
                    linewidth=0, antialiased=False, cmap=cm.coolwarm)
    plt.show()

# plot interp2d versions
myplot(plotx1,ploty1,plotz1)                   # Cartesian meshes
myplot(plotx1,ploty1,plotz2.reshape(2*n+1,-1)) # rotated meshes

# plot griddata versions
myplot(plotx1,ploty1,plotz3.reshape(2*n+1,-1)) # Cartesian meshes
myplot(plotx1,ploty1,plotz4.reshape(2*n+1,-1)) # rotated meshes
So here's a gallery of the results. Using random input z data, and interp2d, Cartesian (left) vs rotated interpolation (right):
Note the horrible scale on the right-hand side, keeping in mind that the input points are all between 0 and 1. Even its mother wouldn't recognize the data set. Note that there are runtime warnings during the evaluation of the rotated data set, so we're being warned that it's all crap.
Now let's do the same with griddata:
We should note that these figures are much closer to each other, and they seem to make way more sense than the output of interp2d. For instance, note the overshoot in the scale of the very first figure.
These artifacts always arise between input data points. Since it's still interpolation, the input points have to be reproduced by the interpolating function, but it's pretty weird that a linear interpolating function overshoots between data points. It's clear that griddata doesn't suffer from this issue.
Consider an even more clear case: the other set of z values, which are smooth and deterministic. The surfaces with interp2d:
HELP! Call the interpolation police! Already the Cartesian input case has inexplicable (well, at least by me) spurious features in it, and the rotated input case poses the threat of s͔̖̰͕̞͖͇ͣ́̈̒ͦ̀̀ü͇̹̞̳ͭ̊̓̎̈m̥̠͈̣̆̐ͦ̚m̻͑͒̔̓ͦ̇oͣ̐ͣṉ̟͖͙̆͋i͉̓̓ͭ̒͛n̹̙̥̩̥̯̭ͤͤͤ̄g͈͇̼͖͖̭̙ ̐z̻̉ͬͪ̑ͭͨ͊ä̼̣̬̗̖́̄ͥl̫̣͔͓̟͛͊̏ͨ͗̎g̻͇͈͚̟̻͛ͫ͛̅͋͒o͈͓̱̥̙̫͚̾͂.
So let's do the same with griddata:
The day is saved, thanks to The Powerpuff Girls scipy.interpolate.griddata. Homework: check the same with cubic interpolation.
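For the homework, the only change needed is the method keyword; a short sketch on top of the code above:

# same griddata call as before, but with cubic instead of linear interpolation
plotz3c = interp.griddata(np.array([x1.ravel(), y1.ravel()]).T, z.ravel(),
                          np.array([plotx1.ravel(), ploty1.ravel()]).T,
                          method='cubic')
myplot(plotx1, ploty1, plotz3c.reshape(2*n+1, -1))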
By the way, a very short answer to your original question is in help(interp.interp2d):
| Notes
| -----
| The minimum number of data points required along the interpolation
| axis is ``(k+1)**2``, with k=1 for linear, k=3 for cubic and k=5 for
| quintic interpolation.
For linear interpolation you need at least 4 points along the interpolation axis, i.e. at least 4 unique x and y values have to be present to get a meaningful result. Check these:
nvals = 3 # -> RuntimeWarning
x = np.linspace(0,1,10)
y = np.random.randint(low=0,high=nvals,size=x.shape)
z = x
interp.interp2d(x,y,z)
nvals = 4 # -> no problem here
x = np.linspace(0,1,10)
y = np.random.randint(low=0,high=nvals,size=x.shape)
z = x
interp.interp2d(x,y,z)
And of course this all ties in to your question like this: it makes a huge difference whether your geometrically 1d data set lies along one of the Cartesian axes, or runs in a general direction so that the coordinates take many different values. It's probably meaningless (or at least very ill-defined) to try 2d interpolation on a geometrically 1d data set, but at least the algorithm shouldn't break if your data lie along a general direction in the x,y plane.