Python scatter plot with numpy-masked arrays


I'm stuck trying to mask data for a scatter plot. All data seems to plot.
I'm using numpy arrays as shown in the snippet below. I'm thinking that perhaps I cannot mask on the "c" array. I can't seem to find any documentation for doing this. I'll try with the "s" array.
Any help is greatly appreciated.
yy = NP.ma.array(yy)
xx = NP.ma.array(xx)
zz_masked = NP.ma.masked_where(zz <= 1.0e6 , zz)
scatter(xx,yy,s=15,c=zz_masked, edgecolors='none')
cbar = colorbar()
show()

Works for me. Each call to scatter() gets its own colorbar since each scatter()'s colors are normalized to its own data. Which version of matplotlib are you using?
import pylab as plt
import numpy as np
x = np.linspace(0, 1, 100)
y = x**2
z = y
z_masked = np.ma.masked_where(z > 0.5, z)
plt.scatter(x, y, c=z, s=15, edgecolors='none')
plt.colorbar()
plt.scatter(x+1, y, c=z_masked, s=15, edgecolors='none')
plt.colorbar()
plt.show()
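
If a particular matplotlib version still seems to draw the masked points, one possible workaround is to drop them explicitly before calling scatter(). The following is only a minimal sketch on synthetic data; the variable name keep is illustrative and not part of the original question.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 100)
y = x**2
z = y

# Mask the values that should not be drawn
z_masked = np.ma.masked_where(z > 0.5, z)

# Boolean array that is True for the points to keep
keep = ~np.ma.getmaskarray(z_masked)

fig, ax = plt.subplots()
# Only the unmasked points are handed to scatter()
sc = ax.scatter(x[keep], y[keep], c=z[keep], s=15, edgecolors='none')
fig.colorbar(sc, ax=ax)
```

Call plt.show() afterwards to display the figure. The colorbar then spans only the values that were kept.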

Related

3D surface graph with matplotlib using dataframe columns to input the data

I have a spreadsheet file that I would like to input to create a 3D surface graph using Matplotlib in Python.
I used plot_trisurf and it worked, but I need the projections of the contour profiles onto the graph that I can get with the surface function, like this example.
I'm struggling to arrange my Z data in a 2D array that I can use to input in the plot_surface method. I tried a lot of things, but none seems to work.
Here is what I have working, using plot_trisurf:
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import pandas as pd
df=pd.read_excel ("/Users/carolethais/Desktop/Dissertação Carol/Códigos/Resultados/res_02_0.5.xlsx")
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) no longer works in recent matplotlib
# I got the graph using trisurf
graf=ax.plot_trisurf(df["Diametro"],df["Comprimento"], df["temp_out"], cmap=matplotlib.cm.coolwarm)
ax.set_xlim(0, 0.5)
ax.set_ylim(0, 100)
ax.set_zlim(25,40)
fig.colorbar(graf, shrink=0.5, aspect=15)
ax.set_xlabel('Diâmetro (m)')
ax.set_ylabel('Comprimento (m)')
ax.set_zlabel('Temperatura de Saída (ºC)')
plt.show()
This is a part of my df, dataframe:
Diametro Comprimento temp_out
0 0.334294 0.787092 34.801994
1 0.334294 8.187065 32.465551
2 0.334294 26.155976 29.206090
3 0.334294 43.648591 27.792126
4 0.334294 60.768219 27.163233
... ... ... ...
59995 0.437266 14.113660 31.947302
59996 0.437266 25.208851 30.317583
59997 0.437266 33.823035 29.405461
59998 0.437266 57.724209 27.891616
59999 0.437266 62.455890 27.709298
I tried this approach to use the imported data with plot_surface. It did produce a graph, but not a correct one. This is how the graph looked with this approach:
Thank you so much

A different approach, based on re-gridding the data, which does not require the original data to be specified on a regular grid (heavily inspired by this example).
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.tri as tri
from mpl_toolkits.mplot3d import Axes3D
np.random.seed(19880808)
# compute the sombrero over a cloud of random points
npts = 10000
x, y = np.random.uniform(-5, 5, npts), np.random.uniform(-5, 5, npts)
z = np.cos(1.5*np.sqrt(x*x + y*y))/(1+0.33*(x*x+y*y))
# prepare the interpolator
triang = tri.Triangulation(x, y)
interpolator = tri.LinearTriInterpolator(triang, z)
# do the interpolation
xi = yi = np.linspace(-5, 5, 101)
Xi, Yi = np.meshgrid(xi, yi)
Zi = interpolator(Xi, Yi)
# plotting
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) no longer works in recent matplotlib
norm = plt.Normalize(-1, 1)
ax.plot_surface(Xi, Yi, Zi,
                cmap='inferno',
                norm=norm)
plt.show()
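
The same re-gridding idea can be applied to DataFrame columns and combined with a contour projection. This is a sketch under stated assumptions: the DataFrame below is synthetic (only the column names Diametro, Comprimento and temp_out come from the question), the temperature formula is invented for illustration, and the four corner points are added so that the triangulation covers the whole interpolation grid.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.tri as tri
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, only needed on older matplotlib

# Synthetic stand-in for the spreadsheet (column names taken from the question)
rng = np.random.default_rng(0)
d = np.concatenate([rng.uniform(0.1, 0.5, 2000), [0.1, 0.1, 0.5, 0.5]])
c = np.concatenate([rng.uniform(0.0, 100.0, 2000), [0.0, 100.0, 0.0, 100.0]])
df = pd.DataFrame({"Diametro": d, "Comprimento": c})
# Invented relation, used only to have something to plot
df["temp_out"] = 25.0 + 10.0 * np.exp(-df["Comprimento"] / 30.0) * (df["Diametro"] / 0.5)

# Interpolate the scattered columns onto a regular grid
triang = tri.Triangulation(df["Diametro"].to_numpy(), df["Comprimento"].to_numpy())
interp = tri.LinearTriInterpolator(triang, df["temp_out"].to_numpy())
xi = np.linspace(0.11, 0.49, 60)
yi = np.linspace(1.0, 99.0, 50)
Xi, Yi = np.meshgrid(xi, yi)
# The interpolator returns a masked array; turn masked cells into NaN
Zi = np.ma.filled(interp(Xi, Yi), np.nan)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(Xi, Yi, Zi, cmap="coolwarm")
# Project the contour lines onto a plane below the surface
z_floor = np.nanmin(Zi) - 2.0
ax.contour(Xi, Yi, Zi, zdir="z", offset=z_floor, cmap="coolwarm")
ax.set_zlim(z_floor, np.nanmax(Zi))
```

With real data, replace the synthetic DataFrame by the one read from the spreadsheet and choose xi and yi inside the range covered by the measurements.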

plot_trisurf expects x, y, z as 1D arrays, while plot_surface expects X, Y, Z as 2D arrays, or x, y as 1D arrays and Z as a 2D array.
Your data consists of three 1D arrays, so plotting them with plot_trisurf is immediate, but you need plot_surface to be able to project the isolines onto the coordinate planes, so you need to reshape your data.
It seems that you have 60000 data points. In the following I assume that they lie on a regular grid with 300 points in the x direction and 200 points in y, but what matters is the idea of a regular grid.
The code below shows
1. the use of plot_trisurf (with a coarser mesh), similar to your code;
2. the correct use of reshaping and its application in plot_surface; note that the number of rows in reshaping corresponds to the number of points in y and the number of columns to the number of points in x;
3. and 4. incorrect use of reshaping; the resulting subplots are somewhat similar to the plot you showed, so maybe you just need to fix the number of rows and columns.
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
x, y = np.arange(30)/3.-5, np.arange(20)/2.-5
x, y = (arr.flatten() for arr in np.meshgrid(x, y))
z = np.cos(1.5*np.sqrt(x*x + y*y))/(1+0.1*(x*x+y*y))
fig, axes = plt.subplots(2, 2, subplot_kw={"projection" : "3d"})
axes = iter(axes.flatten())
ax = next(axes)
ax.plot_trisurf(x,y,z, cmap='Reds')
ax.set_title('Trisurf')
X, Y, Z = (arr.reshape(20,30) for arr in (x,y,z))
ax = next(axes)
ax.plot_surface(X,Y,Z, cmap='Reds')
ax.set_title('Surface 20×30')
X, Y, Z = (arr.reshape(30,20) for arr in (x,y,z))
ax = next(axes)
ax.plot_surface(X,Y,Z, cmap='Reds')
ax.set_title('Surface 30×20')
X, Y, Z = (arr.reshape(40,15) for arr in (x,y,z))
ax = next(axes)
ax.plot_surface(X,Y,Z, cmap='Reds')
ax.set_title('Surface 40×15')
plt.tight_layout()
plt.show()
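
When the three columns live in a DataFrame and really do lie on a regular grid, a pivot is one way to obtain the 2D arrays without working out the reshape dimensions by hand. A minimal sketch, assuming hypothetical column names x, y and z and the same test function as above:

```python
import numpy as np
import pandas as pd

# Regular 30 x 20 grid flattened into three columns, like a spreadsheet export
x, y = np.meshgrid(np.arange(30) / 3.0 - 5, np.arange(20) / 2.0 - 5)
df = pd.DataFrame({"x": x.ravel(), "y": y.ravel()})
r2 = df["x"]**2 + df["y"]**2
df["z"] = np.cos(1.5 * np.sqrt(r2)) / (1 + 0.1 * r2)

# Rows are indexed by y, columns by x; pivot raises if an (x, y) pair repeats
grid = df.pivot(index="y", columns="x", values="z")
X, Y = np.meshgrid(grid.columns.to_numpy(), grid.index.to_numpy())
Z = grid.to_numpy()
```

X, Y and Z can then be passed to ax.plot_surface(X, Y, Z). If the pivoted table contains NaN, some (x, y) combinations are missing and the data is not on a regular grid after all.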

Matplotlib colormap bug with length-4 arrays

I have some arrays that I need to plot in a loop with a certain colormap. However, one of my arrays is length-4, and I run into this problem:
import numpy as np
import matplotlib.pyplot as plt
ns = range(2,8)
cm = plt.cm.get_cmap('spectral')
cmap = [cm(1.*i/len(ns)) for i in range(len(ns))]
for i,n in enumerate(ns):
    x = np.linspace(0, 10, num=n)
    y = np.zeros(n) + i
    plt.scatter(x, y, c=cmap[i], edgecolor='none', s=50, label=n)
plt.legend(loc='lower left')
plt.show()
For n=4, it looks like Matplotlib is applying each element of the cmap RGBA-tuple to each value of the array. For the other length arrays, the behavior is expected.
Now, I actually have a much more complicated code and do not want to spend time rewriting the loop. Is there a workaround for this?

It looks like you've bumped into an unfortunate API design in the handling of the c argument. One way to work around the problem is to make c an array with shape (len(x), 4) containing len(x) copies of the desired color. E.g.
ns = range(2,8)
cm = plt.cm.get_cmap('spectral')
cmap = [cm(1.*i/len(ns)) for i in range(len(ns))]
for i,n in enumerate(ns):
    x = np.linspace(0, 10, num=n)
    y = np.zeros(n) + i
    c = np.tile(cmap[i], (len(x), 1))
    plt.scatter(x, y, c=c, edgecolor='none', s=50, label=n)
plt.legend(loc='lower left')
plt.show()
Another alternative is to convert the RGB values into a hex string, and pass the alpha channel of the color using the alpha argument. As @ali_m pointed out in a comment, the function matplotlib.colors.rgb2hex makes this easy. If you know the alpha channel of the color is always 1.0, you can remove the code that creates the alpha argument.
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
ns = range(2,8)
cm = plt.cm.get_cmap('spectral')
cmap = [cm(1.*i/len(ns)) for i in range(len(ns))]
for i,n in enumerate(ns):
    x = np.linspace(0, 10, num=n)
    y = np.zeros(n) + i
    c = mpl.colors.rgb2hex(cmap[i])
    alpha = cmap[i][3]
    plt.scatter(x, y, c=c, edgecolor='none', s=50, label=n, alpha=alpha)
plt.legend(loc='lower left')
plt.show()
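
A shorter variant of the first workaround is to pass the color as a 2D array with a single row, which the scatter() documentation describes as the way to give one RGB or RGBA value to all points. The sketch below assumes the viridis colormap (any registered colormap name should work) and collects the returned collections in a list purely for inspection.

```python
import numpy as np
import matplotlib.pyplot as plt

ns = range(2, 8)
cm = plt.get_cmap("viridis")  # assumption: any registered colormap would do
colors = [cm(i / len(ns)) for i in range(len(ns))]

fig, ax = plt.subplots()
collections = []
for i, n in enumerate(ns):
    x = np.linspace(0, 10, num=n)
    y = np.zeros(n) + i
    # A (1, 4) array is read as one RGBA color, even when len(x) == 4
    c = np.atleast_2d(colors[i])
    collections.append(ax.scatter(x, y, c=c, edgecolor="none", s=50, label=n))
ax.legend(loc="lower left")
```

Unlike np.tile, this does not need to know len(x), so it fits into an existing loop with a one-line change.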

Plotting a 2D heatmap

Using Matplotlib, I want to plot a 2D heat map. My data is an n-by-n Numpy array, each with a value between 0 and 1. So for the (i, j) element of this array, I want to plot a square at the (i, j) coordinate in my heat map, whose color is proportional to the element's value in the array.
How can I do this?

The imshow() function with parameters interpolation='nearest' and cmap='hot' should do what you want.
Please review the interpolation parameter details, and see Interpolations for imshow and Image antialiasing.
import matplotlib.pyplot as plt
import numpy as np
a = np.random.random((16, 16))
plt.imshow(a, cmap='hot', interpolation='nearest')
plt.show()
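
Since the values in the question are known to lie between 0 and 1, it may help to pin the color scale to that range and add a colorbar. A minimal sketch on random data; the axis labels are illustrative:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
a = rng.random((16, 16))  # values in [0, 1)

fig, ax = plt.subplots()
# vmin/vmax pin the color scale to [0, 1] regardless of the data's actual range
im = ax.imshow(a, cmap="hot", interpolation="nearest", vmin=0.0, vmax=1.0)
fig.colorbar(im, ax=ax, label="value")
ax.set_xlabel("j (column index)")
ax.set_ylabel("i (row index)")
```

By default imshow() puts row 0 at the top; pass origin="lower" if row 0 should be at the bottom instead.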

Seaborn is a high-level API for matplotlib, which takes care of a lot of the manual work.
seaborn.heatmap automatically plots a gradient at the side of the chart etc.
import numpy as np
import seaborn as sns
import matplotlib.pylab as plt
uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data, linewidth=0.5)
plt.show()
You can even plot only the upper or lower triangle of a square matrix. For example, a correlation matrix is square and symmetric, so plotting all of its values would be redundant.
corr = np.corrcoef(np.random.randn(10, 200))
mask = np.zeros_like(corr)
mask[np.triu_indices_from(mask)] = True
with sns.axes_style("white"):
    ax = sns.heatmap(corr, mask=mask, vmax=.3, square=True, cmap="YlGnBu")
plt.show()

I would use matplotlib's pcolor/pcolormesh function since it allows nonuniform spacing of the data.
Example taken from matplotlib:
import matplotlib.pyplot as plt
import numpy as np
# generate 2 2d grids for the x & y bounds
y, x = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
z = (1 - x / 2. + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)
# x and y are bounds, so z should be the value *inside* those bounds.
# Therefore, remove the last value from the z array.
z = z[:-1, :-1]
z_min, z_max = -np.abs(z).max(), np.abs(z).max()
fig, ax = plt.subplots()
c = ax.pcolormesh(x, y, z, cmap='RdBu', vmin=z_min, vmax=z_max)
ax.set_title('pcolormesh')
# set the limits of the plot to the limits of the data
ax.axis([x.min(), x.max(), y.min(), y.max()])
fig.colorbar(c, ax=ax)
plt.show()
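
The example above uses evenly spaced bounds. To illustrate the nonuniform case, here is a small sketch with hand-picked, unevenly spaced cell edges; the edge values and the data are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Cell edges with uneven spacing: 4 columns and 3 rows of cells
x_edges = np.array([0.0, 1.0, 3.0, 7.0, 15.0])
y_edges = np.array([0.0, 0.5, 2.0, 4.0])
# One value per cell, so the shape is (len(y_edges) - 1, len(x_edges) - 1)
z = np.arange(12, dtype=float).reshape(3, 4) / 11.0

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x_edges, y_edges, z, cmap="RdBu", vmin=0.0, vmax=1.0)
fig.colorbar(mesh, ax=ax)
```

Each cell is drawn between consecutive edges, so wider intervals produce wider rectangles.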

For a 2D NumPy array, simply using imshow() may help:
import matplotlib.pyplot as plt
import numpy as np
def heatmap2d(arr: np.ndarray):
    plt.imshow(arr, cmap='viridis')
    plt.colorbar()
    plt.show()

test_array = np.arange(100 * 100).reshape(100, 100)
heatmap2d(test_array)
This code produces a continuous heatmap.
You can choose another built-in colormap from here.

Here's how to do it from a csv:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata
# Load data from CSV
dat = np.genfromtxt('dat.xyz', delimiter=' ',skip_header=0)
X_dat = dat[:,0]
Y_dat = dat[:,1]
Z_dat = dat[:,2]
# The columns are already NumPy arrays; copy them element by element
X, Y, Z, = np.array([]), np.array([]), np.array([])
for i in range(len(X_dat)):
    X = np.append(X, X_dat[i])
    Y = np.append(Y, Y_dat[i])
    Z = np.append(Z, Z_dat[i])
# create x-y points to be used in heatmap
xi = np.linspace(X.min(), X.max(), 1000)
yi = np.linspace(Y.min(), Y.max(), 1000)
# Interpolate for plotting
zi = griddata((X, Y), Z, (xi[None,:], yi[:,None]), method='cubic')
# I control the range of my colorbar by removing data
# outside of my range of interest
zmin = 3
zmax = 12
zi[(zi<zmin) | (zi>zmax)] = np.nan
# Create the contour plot
CS = plt.contourf(xi, yi, zi, 15, cmap=plt.cm.rainbow,
                  vmax=zmax, vmin=zmin)
plt.colorbar()
plt.show()
where dat.xyz is in the form
x1 y1 z1
x2 y2 z2
...

Use matshow() which is a wrapper around imshow to set useful defaults for displaying a matrix.
import numpy as np
import matplotlib.pyplot as plt
a = np.diag(range(15))
plt.matshow(a)
https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.matshow.html
This is just a convenience function wrapping imshow to set useful defaults for displaying a matrix. In particular:
Set origin='upper'.
Set interpolation='nearest'.
Set aspect='equal'.
Ticks are placed to the left and above.
Ticks are formatted to show integer indices.

Here is a new python package to plot complex heatmaps with different kinds of row/columns annotations in Python: https://github.com/DingWB/PyComplexHeatmap

Transition line in heat map - python

I have a problem that I can't seem to work around. I have a grid of values that I have interpolated using SciPy's griddata. The values have been visualized as a heat map with values in [0,1]. Now I would like to plot a transition line where the value is 1/2.
Is this possible? My first idea was to extract the coordinates from grid_z that correspond to 1/2 and use those coordinates for a line plot, but I'm not sure how to do that.
Thank you in advance.
EDIT: Solved it via
xInd, yInd = np.where(np.logical_and(grid_z.T > 0.49, grid_z.T < 0.51))
and then plotting the line!

You can use contour() for that:
import numpy
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
x = numpy.linspace(0, 2*numpy.pi, 200)
y = numpy.linspace(0, 2*numpy.pi, 200)
xx, yy = numpy.meshgrid(x, y)
z = numpy.sin(xx) * numpy.cos(yy)
fig = plt.figure()
s = fig.add_subplot(1, 1, 1)
s.imshow(z, vmin=0, vmax=1)
s.contour(z, levels=[0.5])
fig.savefig('t.png')
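
If the coordinates of the transition line are needed as well, they can be read back from the object that contour() returns. The sketch below uses a synthetic field (a sigmoid across a wavy boundary, invented for illustration) in place of grid_z, and assumes the grid coordinates x and y are known.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 100)
y = np.linspace(0, 1, 100)
xx, yy = np.meshgrid(x, y)
# Synthetic field in (0, 1) whose 0.5 level follows a wavy line
boundary = 0.5 + 0.2 * np.sin(2 * np.pi * yy)
z = 1.0 / (1.0 + np.exp(-10.0 * (xx - boundary)))

fig, ax = plt.subplots()
ax.imshow(z, vmin=0, vmax=1, origin="lower", extent=(0, 1, 0, 1))
# Passing x and y puts the contour in data coordinates instead of pixel indices
cs = ax.contour(x, y, z, levels=[0.5], colors="white")

# Vertices of the 0.5 contour, as a list of (N, 2) arrays of (x, y) points
segments = cs.allsegs[0]
```

Compared with selecting cells between 0.49 and 0.51, the contour is interpolated between grid points, so it yields an ordered line rather than a band of cells.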

Python/matplotlib mplot3d- how do I set a maximum value for the z-axis?

I am trying to make a 3-dimensional surface plot for the expression: z = y^2/x, for x in the interval [-2,2] and y in the interval [-1.4,1.4]. I also want the z-values to range from -4 to 4.
The problem is that when I'm viewing the finished surfaceplot, the z-axis values do not stop at [-4,4].
So my question is: how can I "remove" the z-axis values that fall outside the interval [-4,4] from the finished plot?
My code is:
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(projection="3d")  # fig.gca(projection=...) no longer works in recent matplotlib
x = np.arange(-2.0,2.0,0.1,float) # x in interval [-2,2]
y = np.arange(-1.4,1.4,0.1,float) # y in interval [-1.4,1.4]
x,y = np.meshgrid(x,y)
z = (y**2/x) # z = y^2/x
ax.plot_surface(x, y, z,rstride=1, cstride=1, linewidth=0.25)
ax.set_zlim3d(-4, 4) # viewrange for z-axis should be [-4,4]
ax.set_ylim3d(-2, 2) # viewrange for y-axis should be [-2,2]
ax.set_xlim3d(-2, 2) # viewrange for x-axis should be [-2,2]
plt.show()

I am having the same issue and still have not found anything better than clipping my data. Unfortunately, in my case I am tied to matplotlib 1.2.1. But if you can upgrade to version 1.3.0, you may have a solution: there seems to be a batch of new API related to axis ranges. In particular, you may be interested in set_zlim.
Edit 1: I managed to migrate my environment to matplotlib 1.3.0; set_zlim worked like a charm :)
The following code worked for me (by the way, I am running this on OS X; I am not sure whether that has an impact):
# ----------------------------------------------------------------------------
# Make a 3d plot according to data passed as arguments
def Plot3DMap( self, LabelX, XRange, LabelY, YRange, LabelZ, data3d ) :
    fig = plt.figure()
    ax = fig.add_subplot( 111, projection="3d" )
    xs, ys = np.meshgrid( XRange, YRange )
    surf = ax.plot_surface( xs, ys, data3d )
    ax.set_xlabel( LabelX )
    ax.set_ylabel( LabelY )
    ax.set_zlabel( LabelZ )
    ax.set_zlim(0, 100)
    plt.show()

Clipping your data will accomplish this, but it's not very pretty.
z[z>4]= np.nan
z[z<-4]= np.nan
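
Put together with the surface from the question, the clipping could look like the sketch below. It uses np.where to build a clipped copy rather than modifying z in place; the name z_clipped is illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, only needed on older matplotlib

x = np.arange(-2.0, 2.0, 0.1)
y = np.arange(-1.4, 1.4, 0.1)
x, y = np.meshgrid(x, y)
with np.errstate(divide="ignore", invalid="ignore"):
    z = y**2 / x  # grows without bound near x = 0

# Replace everything outside [-4, 4] by NaN in one step
z_clipped = np.where(np.abs(z) <= 4, z, np.nan)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(x, y, z_clipped, rstride=1, cstride=1, linewidth=0.25)
ax.set_zlim(-4, 4)
```

Faces that touch a NaN vertex are not drawn cleanly, which is why the edge of the clipped region can look jagged; a finer grid reduces the effect.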

Rather than using ax.plot_surface I found ax.plot_trisurf to work well, since you don't need to give it a rectangular grid of values like ax.plot_surface. If you're using numpy arrays, you can then use the following trick to only select points within your z-bounds.
from matplotlib import cm
x, y, z = x.flatten(), y.flatten(), z.flatten()
usable_points = (-4 < z) & (z < 4)
x, y, z = x[usable_points], y[usable_points], z[usable_points]
ax.plot_trisurf(x, y, z, cmap=cm.jet)
