Matplotlib - 3D data visualization - python

Here is an example of one txt file.
My experiment measurements are in several txt files. In reality I will have hundreds of files, but to demonstrate the idea I list only three here: d_401.txt, d_402.txt and d_403.txt. Each file has 4 columns and 256 rows of data; only the first and fourth columns are the data I need, for x and z.
I want to make a 3D surface plot or contour plot out of these files. In my 3D plot, the x-axis is always the 1st column of each file, the z-axis is the 4th column of each file (the z value also needs to be color-coded with a gradient), and the y-axis lines up, one after another in the y direction, the x-z curves from these three files.
How do I write the Python code for this plot? I'm especially confused about how to fill the matrix Z. I would greatly appreciate it if somebody could help me with this issue, as I need to have the figure plotted soon.
I attached my premature code (presumably full of errors).
Thanks a million!!!
import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
z_txt = np.array(['d_401.txt', 'd_402.txt', 'd_403.txt'])
zarray = np.zeros([z_txt.size])
y = np.arange(3)
x = np.zeros(256)
Z = np.zeros((len(y),len(x)))
for i in range(z_txt.size):
    zarray[i] = np.loadtxt(z_txt[i])
    x[i] = zarray[i,0]
    X, Y = np.meshgrid(x,y)
    Z[i] = zarray[i,3]
ax.plot_surface(X, Y, Z, cmap=cm.magma, shade=True, lw=3)
plt.show()

The following works under the hypothesis that all the files contain the same vector of x …
In [146]: import numpy as np
     ...: import matplotlib.pyplot as plt
     ...: from matplotlib.cm import ScalarMappable as sm
     ...: from glob import glob
     ...:
     ...: # create fake data and put it in files
     ...: x0, y0 = np.arange(11.0), np.arange(3)+1.0
     ...: z0 = x0+y0[:,None]
     ...: for i in range(3):
     ...:     with open('delendo%d'%(i+1), 'w') as out:
     ...:         for x0z0 in zip(x0, x0+x0, x0-x0, z0[i]):
     ...:             print(*x0z0, sep='\t', file=out)
     ...:
     ...: # sorted list of "interesting" files
     ...: files = sorted(glob('delendo*'))
     ...:
     ...: # read the matrix of z values
     ...: Z = np.array([np.loadtxt(f, usecols=3) for f in files])
     ...: # read the vector of x values (the same for all files, I hope so!)
     ...: x = np.loadtxt(files[0], usecols=0)
     ...: # we have as many y's as rows in Z, and we count from 1
     ...: y = np.arange(Z.shape[0])+1
     ...:
     ...: # prepare for a 3D plot and plot as a surface
     ...: fig, ax = plt.subplots(constrained_layout=True,
     ...:                        subplot_kw={"projection" : "3d"})
     ...: surf = ax.plot_surface(*np.meshgrid(x,y), Z, cmap='magma')
     ...: norm = plt.Normalize(*surf.get_clim())
     ...: # pass ax explicitly: a hand-made ScalarMappable is not attached to any Axes
     ...: plt.colorbar(sm(norm=norm, cmap='magma'), ax=ax)
     ...: plt.show()
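The hypothesis that every file shares the same x vector can be checked while Z is being assembled. The following is only a sketch under stated assumptions: load_surface is a hypothetical helper (not part of NumPy or Matplotlib), the demo files are fake data written to a temporary directory, and the column positions (0 for x, 3 for z) follow the question.

```python
import os
import tempfile

import numpy as np


def load_surface(files, x_col=0, z_col=3):
    """Hypothetical helper: stack column z_col of every file into Z
    and verify that all files share the x vector of the first file."""
    x = np.loadtxt(files[0], usecols=x_col)
    rows = []
    for name in files:
        data = np.loadtxt(name, usecols=(x_col, z_col))
        if not np.allclose(data[:, 0], x):
            raise ValueError(f"{name} has a different x vector")
        rows.append(data[:, 1])
    Z = np.array(rows)               # one row per file, one column per x value
    y = np.arange(len(files)) + 1    # files are simply counted from 1
    return x, y, Z


# Demo with three fake four-column files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    x0 = np.arange(11.0)
    files = []
    for i in range(3):
        path = os.path.join(tmp, f"d_{401 + i}.txt")
        np.savetxt(path, np.column_stack([x0, 2 * x0, 0 * x0, x0 + i + 1]))
        files.append(path)
    x, y, Z = load_surface(files)

print(Z.shape)
```

The returned x, y and Z can then be passed to plot_surface as ax.plot_surface(*np.meshgrid(x, y), Z, cmap='magma').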
Addendum
To address some questions raised in the comments by the OP.
You ask about the *sequence syntax. The star is the unpack operator, which can be used either in an assignment or in an expression. Let's see:
>>> tup = (1,2,3,4,5)
>>> a = tup ; print(a)
(1, 2, 3, 4, 5)
>>> a, b = tup
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 2)
But if you use the unpack operator
>>> a, *b = tup ; print(a, b)
1 [2, 3, 4, 5]
Python understands that b is a sequence and stores in it the remaining elements of tup (incidentally, b is a list, not a tuple).
The unpack operator can be used in the middle of the left-hand side, but just once, because Python assigns to the starred item whatever is left over, and two starred items would be ambiguous.
>>> a,*b, c = tup ; print(a, b, c)
1 [2, 3, 4] 5
>>> a,*b, *c = tup ; print(a, b, c)
File "<stdin>", line 1
SyntaxError: multiple starred expressions in assignment
Now, let's see the unpack operator at work in an expression
>>> print(*b)
2 3 4
Using the starred syntax the list is unpacked; it's like
>>> print(b[0], b[1], b[2])
2 3 4
Now, coming to your question: you have already used unpacking in your code, even if you were possibly not aware of it…
X, Y = np.meshgrid(x,y)
Coming to my code
surf = ax.plot_surface(*np.meshgrid(x,y), Z, cmap='magma')
At this point it should be clear: we are unpacking what is returned by meshgrid. It's like X, Y = meshgrid(...) ; surf = ...(X, Y, Z, ...), with the (small) advantage that the grid arrays (X, Y) are not bound to names, so the memory they use can be released as soon as nothing else references them.
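The equivalence between the two-step and the starred call can be checked without Matplotlib. In this small sketch make_grid and surface are hypothetical stand-ins for np.meshgrid and ax.plot_surface; they exist only to show that both spellings pass the same arguments.

```python
def make_grid(nx, ny):
    """Hypothetical stand-in for np.meshgrid: returns a tuple of two nested lists."""
    X = [[i for i in range(nx)] for _ in range(ny)]
    Y = [[j for _ in range(nx)] for j in range(ny)]
    return X, Y


def surface(X, Y, Z, cmap=None):
    """Hypothetical stand-in for ax.plot_surface: just echoes its arguments."""
    return {"X": X, "Y": Y, "Z": Z, "cmap": cmap}


Z = [[0, 1, 2], [1, 2, 3]]

# Two-step form: name the grids, then pass them on.
X, Y = make_grid(3, 2)
named = surface(X, Y, Z, cmap="magma")

# One-step form: unpack the returned tuple directly into the call.
starred = surface(*make_grid(3, 2), Z, cmap="magma")

print(named == starred)
```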
With respect to your last point, to create a colorbar Matplotlib needs what is called a scalar mappable (s.m.), and plt.colorbar can check the active Axes to see if there is a s.m. hanging around.
Many Artists (the software agents that draw into the Axes), e.g. the one created by imshow, are registered as the current s.m. and all is well, but the one created by plot_surface isn't, so the poor programmer has to hand a s.m. to colorbar explicitly; here I provide a hand-made one.
To specify a s.m. we need ① a norm and ② a colormap.
The norm is the default one, i.e., plt.Normalize; it needs the limits of Z, and these are the values returned by surf.get_clim(). Again, I used the starred syntax to unpack the two values.
The length of this explanation is a justification for having it omitted in the first place…
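As a hedged alternative to the hand-made s.m.: the object returned by plot_surface is itself a scalar mappable, so it can be handed to colorbar directly, as the Matplotlib gallery does for surface plots. The sketch below assumes a headless run (hence the Agg backend, which is my choice and not part of the answer) and reuses the fake data of the answer; shrink=0.6 is purely cosmetic.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# same fake data as in the answer above
x = np.arange(11.0)
y = np.arange(3) + 1.0
Z = x + y[:, None]

fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
surf = ax.plot_surface(*np.meshgrid(x, y), Z, cmap="magma")

# pass the surface itself as the mappable and say which Axes gives up the space
cbar = fig.colorbar(surf, ax=ax, shrink=0.6)
```

With this form the colorbar takes both its colormap and its limits from the surface, so no separate norm has to be built.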


EDIT: plotting the function, x and y must have the same first dimension

UPDATE:
Using the answer from before, I was able to call the function. Now I have tried to take the difficulty a step further.
I understand that this works:
def sinesum(t, b):
    return sum(b*sin(n*t))
for i in range(0, 10, 1):
    b = i
    n = i
    t = i
    print(sinesum(i,i))
However, I want to be able to plot it with:
import matplotlib.pyplot as plt
t = np.linspace(-10, 10, 20)
plt.plot(t, sinesum(i,i))
plt.show
I get nothing. How do I plot with the function output as y?
When I replace (i, i) with (t, b) I get
x and y must have the same first dimension, but have shapes (20,) and (1,)
I understand that this is because the function only returns a single value. How do I get sinesum(i,i) to return the right number of values for the plot?
You should calculate every value before plotting it:
res = []
for v in t:
    res.append(sinesum(v,b))
plt.plot(t,res)
or, using a list comprehension:
plt.plot(t, [sinesum(v,b) for v in t])
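A self-contained sketch of the same idea follows. The body of sinesum is an assumption, since the question does not fully define it: here it is taken to be the sum of sin(n*t) for n = 1..b, which returns one number per scalar t. What matters for plotting is only that the list of results has the same length as t.

```python
import math

import numpy as np


def sinesum(t, b):
    # hypothetical definition: sum of sin(n*t) for n = 1..b, one scalar per scalar t
    return sum(math.sin(n * t) for n in range(1, b + 1))


t = np.linspace(-10, 10, 20)
b = 3

# evaluate the scalar function once per element of t
res = np.array([sinesum(v, b) for v in t])

print(t.shape, res.shape)
```

Since t and res now have the same first dimension, plt.plot(t, res) followed by plt.show() (with the parentheses) draws the curve.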
Did you mean this?
def f(x):
    return 4*x + 1
for i in range(100):
    print(f(i))

How to plot multiple points from a list using matplotlib?

I have read a list of 3D points from a text file. The list looks like follows:
content = ['2.449,14.651,-0.992,', '6.833,13.875,-1.021,', '8.133,17.431,-1.150,', '3.039,13.724,-0.999,', '16.835,9.456,-1.031,', '16.835,9.457,-1.031,', '15.388,5.893,-0.868,', '13.743,25.743,-1.394,', '14.691,24.988,-1.387,', '15.801,25.161,-1.463,', '14.668,23.056,-1.382,', '22.378,20.268,-1.457,', '21.121,17.041,-1.353,', '19.472,13.555,-1.192,', '22.498,20.115,-1.436,', '13.344,-33.672,-0.282,', '13.329,-33.835,-0.279,', '13.147,-30.690,-0.305,', '13.097,-28.407,-0.339,', '13.251,-28.643,-0.366,', '13.527,-25.067,-0.481,', '19.433,-33.137,-0.408,', '19.445,-29.501,-0.345,', '20.592,-28.004,-0.312,', '19.109,-26.512,-0.380,', '18.521,-24.155,-0.519,', '22.837,48.245,-2.201,', '23.269,50.129,-2.282,', '23.499,46.652,-2.297,', '23.814,48.646,-2.271,', '30.377,46.501,-2.214,', '29.869,44.479,-2.143,', '29.597,41.257,-2.018,', '28.134,40.291,-2.159,', '-40.932,-0.320,-1.390,', '-36.808,0.442,-1.382,', '-30.831,0.548,-1.288,', '-29.404,1.235,-1.300,', '-26.453,1.424,-1.261,', '-30.559,2.775,-1.249,', '-27.714,3.439,-1.201,']
I want to plot all the points on a 3D plot. I have this so far:
#!/usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
with open("measurements.txt") as f:
    content = f.read().splitlines()
#print content
for value in content:
    x, y, z = value.split(',')
#print x, y, z
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.scatter(x, y, z)
fig.savefig('scatterplot.png')
It throws an error:
Traceback (most recent call last):
  File "plotting.py", line 11, in <module>
    x, y, z = value.split(',')
ValueError: too many values to unpack
How do I plot these points? Thank you for your help.
First of all, you need to collect the values into separate lists by splitting the lines of the file, and then pass those lists to the function.
content = ['2.449,14.651,-0.992,', '6.833,13.875,-1.021,', '8.133,17.431,-1.150,', '3.039,13.724,-0.999,', '16.835,9.456,-1.031,', '16.835,9.457,-1.031,', '15.388,5.893,-0.868,', '13.743,25.743,-1.394,', '14.691,24.988,-1.387,', '15.801,25.161,-1.463,', '14.668,23.056,-1.382,', '22.378,20.268,-1.457,', '21.121,17.041,-1.353,', '19.472,13.555,-1.192,', '22.498,20.115,-1.436,', '13.344,-33.672,-0.282,', '13.329,-33.835,-0.279,', '13.147,-30.690,-0.305,', '13.097,-28.407,-0.339,', '13.251,-28.643,-0.366,', '13.527,-25.067,-0.481,', '19.433,-33.137,-0.408,', '19.445,-29.501,-0.345,', '20.592,-28.004,-0.312,', '19.109,-26.512,-0.380,', '18.521,-24.155,-0.519,', '22.837,48.245,-2.201,', '23.269,50.129,-2.282,', '23.499,46.652,-2.297,', '23.814,48.646,-2.271,', '30.377,46.501,-2.214,', '29.869,44.479,-2.143,', '29.597,41.257,-2.018,', '28.134,40.291,-2.159,', '-40.932,-0.320,-1.390,', '-36.808,0.442,-1.382,', '-30.831,0.548,-1.288,', '-29.404,1.235,-1.300,', '-26.453,1.424,-1.261,', '-30.559,2.775,-1.249,', '-27.714,3.439,-1.201,']
import numpy as np
import matplotlib.pyplot as plt
#with open("measurements.txt") as f:
#content = f.read().splitlines()
#print content
#for value in content:
# x, y, z = value.split(',')
x = [float(i.split(',')[0]) for i in content]
y = [float(i.split(',')[1]) for i in content]
z = [float(i.split(',')[2]) for i in content]
#print(x, y, z)
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.scatter(x, y, z)
fig.savefig('scatterplot.png')
It's clear: when you do your split, there are 4 values.
content = ['2.449,14.651,-0.992,', '6.833,13.875,-1.021,', '8.133,17.431,-1.150,', '3.039,13.724,-0.999,', '16.835,9.456,-1.031,', '16.835,9.457,-1.031,', '15.388,5.893,-0.868,', '13.743,25.743,-1.394,', '14.691,24.988,-1.387,', '15.801,25.161,-1.463,', '14.668,23.056,-1.382,', '22.378,20.268,-1.457,', '21.121,17.041,-1.353,', '19.472,13.555,-1.192,', '22.498,20.115,-1.436,', '13.344,-33.672,-0.282,', '13.329,-33.835,-0.279,', '13.147,-30.690,-0.305,', '13.097,-28.407,-0.339,', '13.251,-28.643,-0.366,', '13.527,-25.067,-0.481,', '19.433,-33.137,-0.408,', '19.445,-29.501,-0.345,', '20.592,-28.004,-0.312,', '19.109,-26.512,-0.380,', '18.521,-24.155,-0.519,', '22.837,48.245,-2.201,', '23.269,50.129,-2.282,', '23.499,46.652,-2.297,', '23.814,48.646,-2.271,', '30.377,46.501,-2.214,', '29.869,44.479,-2.143,', '29.597,41.257,-2.018,', '28.134,40.291,-2.159,', '-40.932,-0.320,-1.390,', '-36.808,0.442,-1.382,', '-30.831,0.548,-1.288,', '-29.404,1.235,-1.300,', '-26.453,1.424,-1.261,', '-30.559,2.775,-1.249,', '-27.714,3.439,-1.201,']
Solution:
for value in content:
    x, y, z, parasitic_value = value.split(',')
The elements in content look like this:
'2.449,14.651,-0.992,'
A slightly different way to extract the data to plot from this string is to consider it as a tuple, and to use eval().
data = [eval("("+x[:len(x)-1]+")") for x in content]
Which returns:
[(2.449, 14.651, -0.992),
(6.833, 13.875, -1.021),
(8.133, 17.431, -1.15),
...
(-30.559, 2.775, -1.249),
(-27.714, 3.439, -1.201)]
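Because eval() executes whatever the string contains, a more cautious variant of the same trick is ast.literal_eval from the standard library, which only accepts Python literals. This is a sketch on a three-element excerpt of content, not the original answer's code.

```python
import ast

# short excerpt of the list from the question
content = ['2.449,14.651,-0.992,', '6.833,13.875,-1.021,', '-27.714,3.439,-1.201,']

# drop the trailing comma, wrap in parentheses, and parse as a tuple literal
data = [ast.literal_eval("(" + s.rstrip(",") + ")") for s in content]

# transpose the list of points into three coordinate sequences
xs, ys, zs = zip(*data)

print(data[0])
```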
EDIT: the error you got means:
You want 3 values, x, y and z, but when the string is split at "," there are more (too many values to unpack).
content[0].split(",")
Out[4]: ['2.449', '14.651', '-0.992', '']
I see at least one error in there.
The most obvious one (because you got an error) is in the splitting.
The third comma at the end causes the string to be split into four elements:
>>> l = 'a,b,c,'
>>> l.split(',')
['a', 'b', 'c', '']
You can work around that by using:
x,y,z,_ = value.split(',')
The next problem you'll run into is with your loop:
for value in content:
    x, y, z = value.split(',')
You are only storing the last of your values, since you overwrite them on every iteration.
The easiest way to work around this is to create three lists and append to them:
x = []
y = []
z = []
for measurement in content:
    a,b,c,_ = measurement.split(',')
    # convert the strings to numbers so that scatter gets numeric data
    x.append(float(a))
    y.append(float(b))
    z.append(float(c))
This is not the most efficient way, but I think it should be easier to understand.
I recommend using it like this:
x = []
y = []
z = []
with open('measurements.txt') as file:
    for line in file:
        a,b,c,_ = line.split(',')
        x.append(float(a))
        y.append(float(b))
        z.append(float(c))
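The loop above can be made a little more tolerant of the trailing comma and of blank lines by discarding empty fields instead of unpacking a dummy fourth value. The following is a sketch, not the answer's code: the file is a temporary copy of three lines from the question, and the empty-field filter is my assumption about how stray separators should be handled.

```python
import os
import tempfile

lines = ['2.449,14.651,-0.992,', '6.833,13.875,-1.021,', '8.133,17.431,-1.150,']

with tempfile.TemporaryDirectory() as tmp:
    # write a small stand-in for measurements.txt
    path = os.path.join(tmp, "measurements.txt")
    with open(path, "w") as out:
        out.write("\n".join(lines) + "\n")

    points = []
    with open(path) as f:
        for line in f:
            # the trailing comma yields an empty last field; keep only non-empty ones
            fields = [v for v in line.strip().split(",") if v]
            if fields:  # skip blank lines
                points.append(tuple(float(v) for v in fields))

# transpose the list of (x, y, z) tuples into three lists
x, y, z = (list(col) for col in zip(*points))

print(x)
```

The three lists can then go straight into ax.scatter(x, y, z).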
To solve the main issue, you have to edit the list and make it a 3D numpy array by copying all the values, traversing the list via re.
Rather than treating the list as multiple points, try to take the first 2 or 3 points as an image/3D graph, and use imshow or Axes3D to plot it.

pyplot, plotting from left to right

I have some data I want to plot; x and y are in the same format as in this small piece of example code.
import matplotlib.pyplot as plt
y = [1,1,3,4]
x = [1,4,2,3]
plt.plot(x,y,'-o')
plt.show()
This results in quite a weird graph.
What pyplot does is draw a line from the first point inserted to the second, then to the third, etc.
I want it to draw a line from low x to high x, but I can't seem to find a nice way to do this. I want my line to be like this.
What is the easiest way to achieve this, given that my x and y data are in the same format but more complex than this example?
To get the graph you describe, the values in x need to be in sorted order, which you can achieve like this:
z = sorted(zip(x,y))
x=[i[0] for i in z]
y=[i[1] for i in z]
and now use x and y for plotting (not tested).
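A quick check of the idea on the data from the question: this sketch sorts by an index permutation rather than by sorting the (x, y) pairs, so that ties in x keep their original order instead of being ordered by y. That tie-breaking choice is mine, not part of the answer.

```python
x = [1, 4, 2, 3]
y = [1, 1, 3, 4]

# indices of x in ascending order of the x values (sorted() is stable)
order = sorted(range(len(x)), key=lambda i: x[i])

# apply the same permutation to both lists so the pairs stay together
xs = [x[i] for i in order]
ys = [y[i] for i in order]

print(xs, ys)  # [1, 2, 3, 4] [1, 3, 4, 1]
```

Plotting xs against ys with plt.plot(xs, ys, '-o') then draws the line from low x to high x.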
You can also sort your x list while applying the same swaps to y:
import matplotlib.pyplot as plt
y = [1,1,3,4]
x = [1,4,2,3]
# bubble sort on x; every swap is mirrored in y so the pairs stay together
for i in range(len(x)):
    for k in range( len( x ) - 1, i, -1 ):
        if ( x[k] < x[k - 1] ):
            x[k-1],x[k]=x[k],x[k-1]
            y[k-1],y[k]= y[k],y[k-1]
print(x, y)
plt.plot(x,y,'-o')
plt.show()

Correct usage of scipy.interpolate.RegularGridInterpolator

I am a little confused by the documentation for scipy.interpolate.RegularGridInterpolator.
Say for instance I have a function f: R^3 => R which is sampled on the vertices of the unit cube. I would like to interpolate so as to find values inside the cube.
import numpy as np
# Grid points / sample locations
X = np.array([[0,0,0], [0,0,1], [0,1,0], [0,1,1], [1,0,0], [1,0,1], [1,1,0], [1,1,1.]])
# Function values at the grid points
F = np.random.rand(8)
Now, RegularGridInterpolator takes a points argument, and a values argument.
points : tuple of ndarray of float, with shapes (m1, ), ..., (mn, )
The points defining the regular grid in n dimensions.
values : array_like, shape (m1, ..., mn, ...)
The data on the regular grid in n dimensions.
I interpret this as meaning that I can call it like this:
import scipy.interpolate as irp
rgi = irp.RegularGridInterpolator(X, F)
However, when I do so, I get the following error:
ValueError: There are 8 point arrays, but values has 1 dimensions
What am I misinterpreting in the docs?
Ok I feel silly when I answer my own question, but I found my mistake with help from the documentation of the original regulargrid lib:
https://github.com/JohannesBuchner/regulargrid
points should be a list of arrays that specifies how the points are spaced along each axis.
For example, to take the unit cube as above, I should set:
pts = ( np.array([0,1.]), )*3
or if I had data which was sampled at higher resolution along the last axis, I might set:
pts = ( np.array([0,1.]), np.array([0,1.]), np.array([0,0.5,1.]) )
Finally, values has to have a shape corresponding to the grid laid out implicitly by points. For example,
# a tuple of axis lengths (np.zeros cannot take a map object in Python 3)
val_size = tuple(q.shape[0] for q in pts)
vals = np.zeros( val_size )
# make an arbitrary function to test:
func = lambda pt: (pt**2).sum()
# collect func's values at grid pts
for i in range(pts[0].shape[0]):
    for j in range(pts[1].shape[0]):
        for k in range(pts[2].shape[0]):
            vals[i,j,k] = func(np.array([pts[0][i], pts[1][j], pts[2][k]]))
So finally,
rgi = irp.RegularGridInterpolator(points=pts, values=vals)
runs and performs as desired.
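For the unit-cube data of the question specifically, the eight samples can be reused as they are. This sketch assumes, as in the question's listing of X, that the vertices are ordered with the last coordinate varying fastest, so that a plain reshape to (2, 2, 2) puts each value at the right grid position; the seeded random generator is only there to make the example reproducible.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# vertices of the unit cube, last coordinate varying fastest (as in the question)
X = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
              [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1.]])
rng = np.random.default_rng(0)
F = rng.random(8)  # one function value per vertex

axis = np.array([0.0, 1.0])
vals = F.reshape(2, 2, 2)  # vals[i, j, k] is the value at (axis[i], axis[j], axis[k])

rgi = RegularGridInterpolator((axis, axis, axis), vals)

corner_values = rgi(X)             # evaluating at the vertices reproduces F
centre = rgi([[0.5, 0.5, 0.5]])    # linear interpolation at the cube centre

print(np.allclose(corner_values, F), centre)
```

With the default linear method, the value at the centre of the cube is the mean of the eight vertex values, which gives a simple sanity check.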
Your answer is nicer, and it's perfectly OK for you to accept it. I'm just adding this as an "alternate" way to script it.
import numpy as np
import scipy.interpolate as spint
RGI = spint.RegularGridInterpolator
x = np.linspace(0, 1, 3) # or 0.5*np.arange(3.) works too
# populate the 3D array of values (re-using x because lazy)
X, Y, Z = np.meshgrid(x, x, x, indexing='ij')
vals = np.sin(X) + np.cos(Y) + np.tan(Z)
# make the interpolator, (list of 1D axes, values at all points)
rgi = RGI(points=[x, x, x], values=vals) # can also be [x]*3 or (x,)*3
tst = (0.47, 0.49, 0.53)
print(rgi(tst))
print(np.sin(tst[0]) + np.cos(tst[1]) + np.tan(tst[2]))
which prints approximately:
1.93765972087
1.92113615659

Using scipy's kmeans2 function in python

I found this example of using the kmeans2 algorithm in Python. I can't understand the following part:
# make some z vlues
z = numpy.sin(xy[:,1]-0.2*xy[:,1])
# whiten them
z = whiten(z)
# let scipy do its magic (k==3 groups)
res, idx = kmeans2(numpy.array(zip(xy[:,0],xy[:,1],z)),3)
The points are zip(xy[:,0],xy[:,1]), so what is the third value z doing here?
Also what is whitening?
Any explanation is appreciated. Thanks.
First:
# make some z vlues
z = numpy.sin(xy[:,1]-0.2*xy[:,1])
The weirdest thing about this is that it's equivalent to:
z = numpy.sin(0.8*xy[:, 1])
So I don't know why it's written that way. Maybe there's a typo?
Next,
# whiten them
z = whiten(z)
Whitening here simply means rescaling the data to unit variance, i.e. dividing it by its standard deviation. See here for a demo:
>>> z = np.sin(.8*xy[:, 1]) # the original z
>>> zw = vq.whiten(z) # save it under a different name
>>> zn = z / z.std() # make another 'normalized' array
>>> list(map(np.std, [z, zw, zn])) # standard deviations of the three arrays
[0.42645, 1.0, 1.0]
>>> np.allclose(zw, zn) # whitened is the same as normalized
True
It's not obvious to me why it is whitened. Anyway, moving along:
# let scipy do its magic (k==3 groups)
res, idx = kmeans2(numpy.array(zip(xy[:,0],xy[:,1],z)),3)
Let's break that into two parts:
data = np.array(zip(xy[:, 0], xy[:, 1], z))
which is a weird (and slow) way of writing
data = np.column_stack([xy, z])
In any case, you started with two arrays and merged them into one:
>>> xy.shape
(30, 2)
>>> z.shape
(30,)
>>> data.shape
(30, 3)
Then it's data that is passed to the kmeans algorithm:
res, idx = vq.kmeans2(data, 3)
So now you can see that it's 30 points in 3D space that are passed to the algorithm, and the confusing part is how the set of points was created.
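The whole pipeline can be reproduced in a few lines. This is a sketch under assumptions: xy is random stand-in data (the example's real xy is not shown here), all three columns are whitened rather than only z as in the quoted example, and minit="points" is chosen so that the initial centroids are taken from the data.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, whiten

rng = np.random.default_rng(0)
xy = rng.random((30, 2))            # stand-in for the 30 two-dimensional points
z = np.sin(0.8 * xy[:, 1])          # the simplified form of the z values

data = np.column_stack([xy, z])     # 30 observations with 3 features each
white = whiten(data)                # each column divided by its standard deviation

# cluster the 30 three-dimensional points into k = 3 groups
centroids, labels = kmeans2(white, 3, minit="points")

print(data.shape, centroids.shape, labels.shape)
```

Whitening every column puts x, y and z on the same scale before distances are computed, which is one plausible reason for using it before k-means.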
