I have a dataframe like this:
import random
import matplotlib.pyplot as plt
plt.style.use('ggplot')
fig = plt.figure(figsize=(16,8))
import pandas as pd
data = pd.DataFrame({"X":random.sample(range(530000, 560000), 60),
"Y":random.sample(range(8580000, 8620000), 60),
"PROPERTY":random.choices(range(0, 30), k=60)})
I saw an example where I could plot my PROPERTY along the X and Y coordinates as a triangular spatial distribution:
x = data["X"]
y = data["Y"]
z = data["PROPERTY"]
# Plot Triangular Color Filled Contour
plt.tricontourf(x, y, z, cmap="rainbow")
plt.colorbar()
plt.tricontour(x, y, z)
# Set well shapes
plt.scatter(x, y, color='black')
plt.xlabel("X")
plt.ylabel("Y")
However, I would like to plot it as a different map type, without these abrupt data transitions, maybe with kriging or a smooth interpolation like this example:
Could anyone show me an example?
I used the pykrige package to interpolate the point data into a grid field.
The code and output figure are here.
import random
import matplotlib.pyplot as plt
plt.style.use('ggplot')
fig = plt.figure(figsize=(6,4))
import pandas as pd
from pykrige import OrdinaryKriging
import numpy as np
random.seed(100)
data = pd.DataFrame({"X":random.sample(range(530000, 560000), 60),
"Y":random.sample(range(8580000, 8620000), 60),
"PROPERTY":random.choices(range(0, 30), k=60)})
x = data["X"]
y = data["Y"]
z = data["PROPERTY"]
# Target grid: 700 x 400 nodes covering the data extent
x1 = np.linspace(530000, 560000, 700)
y1 = np.linspace(8580000, 8620000, 400)
# Gaussian variogram parameters: sill, range and nugget
dict1 = {'sill': 1, 'range': 6500.0, 'nugget': 0.1}
OK = OrdinaryKriging(x, y, z, variogram_model='gaussian',
                     variogram_parameters=dict1, nlags=6)
# Kriged estimates and kriging variance on the regular grid
zgrid, ss = OK.execute('grid', x1, y1)
xgrid, ygrid = np.meshgrid(x1, y1)
# Plot a color-filled contour of the kriged grid (the original tricontourf call is kept below for reference)
# plt.tricontourf(x, y, z, cmap="rainbow")
plt.contourf(xgrid, ygrid, zgrid, cmap="rainbow")
plt.colorbar()
# Set well shapes
plt.scatter(x, y, color='black')
plt.xlabel("X")
plt.ylabel("Y")
I have a spreadsheet file that I would like to read in to create a 3D surface graph using Matplotlib in Python.
I used plot_trisurf and it worked, but I need the projections of the contour profiles onto the graph, which I can get with the surface function, like this example.
I'm struggling to arrange my Z data in a 2D array that I can pass to the plot_surface method. I tried a lot of things, but none seemed to work.
Here is what I have working, using plot_trisurf:
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import pandas as pd
df = pd.read_excel("/Users/carolethais/Desktop/Dissertação Carol/Códigos/Resultados/res_02_0.5.xlsx")
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in newer Matplotlib
# I got the graph using trisurf
graf=ax.plot_trisurf(df["Diametro"],df["Comprimento"], df["temp_out"], cmap=matplotlib.cm.coolwarm)
ax.set_xlim(0, 0.5)
ax.set_ylim(0, 100)
ax.set_zlim(25,40)
fig.colorbar(graf, shrink=0.5, aspect=15)
ax.set_xlabel('Diâmetro (m)')
ax.set_ylabel('Comprimento (m)')
ax.set_zlabel('Temperatura de Saída (ºC)')
plt.show()
This is part of my dataframe df:
Diametro Comprimento temp_out
0 0.334294 0.787092 34.801994
1 0.334294 8.187065 32.465551
2 0.334294 26.155976 29.206090
3 0.334294 43.648591 27.792126
4 0.334294 60.768219 27.163233
... ... ... ...
59995 0.437266 14.113660 31.947302
59996 0.437266 25.208851 30.317583
59997 0.437266 33.823035 29.405461
59998 0.437266 57.724209 27.891616
59999 0.437266 62.455890 27.709298
I tried this approach to use the imported data with plot_surface, and I did get a graph, but it didn't come out right; this is how the graph looked with that approach:
Thank you so much
A different approach, based on re-gridding the data, which doesn't require the original data to be specified on a regular grid [deeply inspired by this example].
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.tri as tri
from mpl_toolkits.mplot3d import Axes3D
np.random.seed(19880808)
# compute the sombrero over a cloud of random points
npts = 10000
x, y = np.random.uniform(-5, 5, npts), np.random.uniform(-5, 5, npts)
z = np.cos(1.5*np.sqrt(x*x + y*y))/(1+0.33*(x*x+y*y))
# prepare the interpolator
triang = tri.Triangulation(x, y)
interpolator = tri.LinearTriInterpolator(triang, z)
# do the interpolation
xi = yi = np.linspace(-5, 5, 101)
Xi, Yi = np.meshgrid(xi, yi)
Zi = interpolator(Xi, Yi)
# plotting
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in newer Matplotlib
norm = plt.Normalize(-1, 1)
ax.plot_surface(Xi, Yi, Zi,
                cmap='inferno',
                norm=norm)
plt.show()
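The original question also asked for the contour profiles projected onto the coordinate planes. A hedged follow-up sketch (the offsets are illustrative; these calls would go just before plt.show() above): once Xi, Yi, Zi live on a regular grid, Axes3D.contour with zdir/offset draws the projections.
# Project contour lines onto the bottom (z), left (x) and back (y) planes.
ax.contour(Xi, Yi, Zi, zdir='z', offset=float(Zi.min()), cmap='inferno')
ax.contour(Xi, Yi, Zi, zdir='x', offset=-5, cmap='inferno')
ax.contour(Xi, Yi, Zi, zdir='y', offset=5, cmap='inferno')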
plot_trisurf expects x, y, z as 1D arrays, while plot_surface expects X, Y, Z as 2D arrays (or as x, y, Z with x and y being 1D arrays and Z a 2D array).
Your data consists of three 1D arrays, so plotting it with plot_trisurf is immediate, but to project the isolines onto the coordinate planes you need plot_surface, and hence you need to reshape your data.
It seems that you have 60000 data points. In the following I assume a regular grid of 300 points in the x direction and 200 points in y, but what matters is the idea of a regular grid.
The code below shows:
1. the use of plot_trisurf (with a coarser mesh), similar to your code;
2. the correct use of reshaping and its application in plot_surface;
3. note that the number of rows in reshaping corresponds to the number of points in y and the number of columns to the number of points in x;
4. incorrect uses of reshaping; the resulting subplots are somewhat similar to the plot you showed, so maybe you just need to fix the number of rows and columns.
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
x, y = np.arange(30)/3.-5, np.arange(20)/2.-5
x, y = (arr.flatten() for arr in np.meshgrid(x, y))
z = np.cos(1.5*np.sqrt(x*x + y*y))/(1+0.1*(x*x+y*y))
fig, axes = plt.subplots(2, 2, subplot_kw={"projection" : "3d"})
axes = iter(axes.flatten())
ax = next(axes)
ax.plot_trisurf(x,y,z, cmap='Reds')
ax.set_title('Trisurf')
X, Y, Z = (arr.reshape(20,30) for arr in (x,y,z))
ax = next(axes)
ax.plot_surface(X,Y,Z, cmap='Reds')
ax.set_title('Surface 20×30')
X, Y, Z = (arr.reshape(30,20) for arr in (x,y,z))
ax = next(axes)
ax.plot_surface(X,Y,Z, cmap='Reds')
ax.set_title('Surface 30×20')
X, Y, Z = (arr.reshape(40,15) for arr in (x,y,z))
ax = next(axes)
ax.plot_surface(X,Y,Z, cmap='Reds')
ax.set_title('Surface 40×15')
plt.tight_layout()
plt.show()
I am trying to use the colormap feature of a 3d-surface plot in matplotlib to color the surface based on values from another array instead of the z-values.
The surface plot is created and displayed as follows:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def gauss(x, y, w_0):
    r = np.sqrt(x**2 + y**2)
    return np.exp(-2*r**2 / w_0**2)
x = np.linspace(-100, 100, 100)
y = np.linspace(-100, 100, 100)
X, Y = np.meshgrid(x, y)
Z = gauss(X, Y, 50)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(X, Y, Z, cmap='jet')
Now instead of coloring based on elevation of the 3d-surface, I am looking to supply the color data for the surface in form of another array, here as an example a random one:
color_data = np.random.uniform(0, 1, size=(Z.shape))
However, I did not find a solution to colorize the 3d-surface based on those values. Ideally, it would look like a contourf plot in 3d, just on the 3d surface.
You can use matplotlib.colors.from_levels_and_colors to obtain a colormap and normalization, then apply those to the values to be colormapped.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.colors
x = np.linspace(-100, 100, 101)
y = np.linspace(-100, 100, 101)
X, Y = np.meshgrid(x, y)
Z = np.exp(-2*np.sqrt(X**2 + Y**2)**2 / 50**2)
c = X+50*np.cos(Y/20) # values to be colormapped
N = 11 # Number of level (edges)
levels = np.linspace(-150,150,N)
colors = plt.get_cmap("RdYlGn", N-1)(np.arange(N-1))  # N-1 discrete colors
cmap, norm = matplotlib.colors.from_levels_and_colors(levels, colors)
color_vals = cmap(norm(c))
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(X, Y, Z, facecolors=color_vals, rstride=1, cstride=1)
plt.show()
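plot_surface with facecolors does not produce a mappable for a colorbar by itself. A small sketch, not part of the original answer, that reuses the same cmap and norm to attach one (placed before plt.show()):
import matplotlib.cm
# Build a ScalarMappable from the same discrete cmap/norm so the colorbar
# matches the facecolors of the surface.
mappable = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)
fig.colorbar(mappable, ax=ax, label='c (colormapped values)')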
I have a data file as a NumPy array and I would like to view it as a 3D image. I am sharing an example where I can view a 2D image of size (100, 100); this is a slice in the xy-plane at z = 0.
import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
X, Y, Z = np.mgrid[-10:10:100j, -10:10:100j, -10:10:100j]
T = np.sin(X*Y*Z)/(X*Y*Z)
T=T[:,:,0]
im = plt.imshow(T, cmap='hot')
plt.colorbar(im, orientation='vertical')
plt.show()
How can I view a 3D image of the data T of shape (100, 100, 100)?
I think the main problem is that you have four pieces of information for each point, so you are actually interested in a 4-dimensional object. Plotting this is always difficult (maybe even impossible). I suggest one of the following solutions:
1. You change the question to: I'm not interested in all combinations of x, y, z, but only in the ones where z = f(x, y).
2. You reduce the resolution of your plot a bit, saying that you don't need 100 levels of z but only maybe 5; then you simply make 5 of the plots you already have.
If you want to use the first method, there are several sub-methods:
A. Plot the 2-dim surface f(x, y) = z and color it with T.
B. Use any technique that is used to plot complex functions; for more info see here.
The plot given by method 1.A (which I think is the best solution) with z = x^2 + y^2 yields:
I used this program:
import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib as mpl
X, Y = np.mgrid[-10:10:100j, -10:10:100j]
Z = (X**2+Y**2)/10 #definition of f
T = np.sin(X*Y*Z)
norm = mpl.colors.Normalize(vmin=np.amin(T), vmax=np.amax(T))
T = mpl.cm.hot(norm(T))  # change T to colors (normalized to [0, 1] first)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
surf = ax.plot_surface(X, Y, Z, facecolors=T, linewidth=0,
                       cstride=1, rstride=1)
plt.show()
The second method gives something like:
With the code:
norm = mpl.colors.Normalize(vmin=-1, vmax=1)
X, Y = np.mgrid[-10:10:101j, -10:10:101j]
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
for i in np.linspace(-1, 1, 5):
    Z = np.zeros(X.shape) + i
    T = np.sin(X*Y*Z)
    T = mpl.cm.hot(norm(T))  # colors for this slice
    ax.plot_surface(X, Y, Z, facecolors=T, linewidth=0, alpha=0.5,
                    cstride=10, rstride=10)
plt.show()
Note: I changed the function to T = sin(X*Y*Z) because dividing by X*Y*Z makes the function behave badly, as you divide two numbers that are both very close to 0.
I have got a solution to my question. If we have the NumPy data, we can convert it into TVTK ImageData, and then visualization is possible with the help of mlab from Mayavi. The code and its 3D visualization are the following:
from tvtk.api import tvtk
import numpy as np
from mayavi import mlab
X, Y, Z = np.mgrid[-10:10:100j, -10:10:100j, -10:10:100j]
data = np.sin(X*Y*Z)/(X*Y*Z)
i = tvtk.ImageData(spacing=(1, 1, 1), origin=(0, 0, 0))
i.point_data.scalars = data.ravel()
i.point_data.scalars.name = 'scalars'
i.dimensions = data.shape
mlab.pipeline.surface(i)
mlab.colorbar(orientation='vertical')
mlab.show()
For another randomly generated dataset,
from numpy import random
data = random.random((20, 20, 20))
The visualization will be
I'd like to make a scatter plot where each point is colored by the spatial density of nearby points.
I've come across a very similar question, which shows an example of this using R:
R Scatter Plot: symbol color represents number of overlapping points
What's the best way to accomplish something similar in python using matplotlib?
In addition to hist2d or hexbin as #askewchan suggested, you can use the same method that the accepted answer in the question you linked to uses.
If you want to do that:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
# Generate fake data
x = np.random.normal(size=1000)
y = x * 3 + np.random.normal(size=1000)
# Calculate the point density
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
fig, ax = plt.subplots()
ax.scatter(x, y, c=z, s=100)
plt.show()
If you'd like the points to be plotted in order of density so that the densest points are always on top (similar to the linked example), just sort them by the z-values. I'm also going to use a smaller marker size here as it looks a bit better:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
# Generate fake data
x = np.random.normal(size=1000)
y = x * 3 + np.random.normal(size=1000)
# Calculate the point density
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
# Sort the points by density, so that the densest points are plotted last
idx = z.argsort()
x, y, z = x[idx], y[idx], z[idx]
fig, ax = plt.subplots()
ax.scatter(x, y, c=z, s=50)
plt.show()
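If you also want a colorbar for the density values, one option (a small sketch, not part of the original answer) is to keep the PathCollection returned by scatter and pass it to fig.colorbar:
fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=z, s=50)
fig.colorbar(sc, ax=ax, label='Density (gaussian_kde)')
plt.show()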
Plotting >100k data points?
The accepted answer, using gaussian_kde(), will take a lot of time. On my machine, 100k rows took about 11 minutes. Here I will add two alternative methods (mpl-scatter-density and datashader) and compare the given answers with the same dataset.
In the following, I used a test data set of 100k rows:
import matplotlib.pyplot as plt
import numpy as np
# Fake data for testing
x = np.random.normal(size=100000)
y = x * 3 + np.random.normal(size=100000)
Output & computation time comparison
Below is a comparison of different methods.
1: mpl-scatter-density
Installation
pip install mpl-scatter-density
Example code
import mpl_scatter_density # adds projection='scatter_density'
from matplotlib.colors import LinearSegmentedColormap
# "Viridis-like" colormap with white background
white_viridis = LinearSegmentedColormap.from_list('white_viridis', [
(0, '#ffffff'),
(1e-20, '#440053'),
(0.2, '#404388'),
(0.4, '#2a788e'),
(0.6, '#21a784'),
(0.8, '#78d151'),
(1, '#fde624'),
], N=256)
def using_mpl_scatter_density(fig, x, y):
    ax = fig.add_subplot(1, 1, 1, projection='scatter_density')
    density = ax.scatter_density(x, y, cmap=white_viridis)
    fig.colorbar(density, label='Number of points per pixel')
fig = plt.figure()
using_mpl_scatter_density(fig, x, y)
plt.show()
Drawing this took 0.05 seconds:
And the zoom-in looks quite nice:
2: datashader
Datashader is an interesting project. Support for matplotlib was added in datashader 0.12.
Installation
pip install datashader
Code (source & parameter listing for dsshow):
import datashader as ds
from datashader.mpl_ext import dsshow
import pandas as pd
def using_datashader(ax, x, y):
    df = pd.DataFrame(dict(x=x, y=y))
    dsartist = dsshow(
        df,
        ds.Point("x", "y"),
        ds.count(),
        vmin=0,
        vmax=35,
        norm="linear",
        aspect="auto",
        ax=ax,
    )
    plt.colorbar(dsartist)
fig, ax = plt.subplots()
using_datashader(ax, x, y)
plt.show()
It took 0.83 s to draw this:
It is also possible to colorize by a third variable. The third parameter of dsshow controls the coloring; a hedged sketch follows below. See more examples here and the source for dsshow here.
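A minimal sketch of that idea, assuming the dataframe carries an extra column named val (the column name and the ds.mean aggregator choice are illustrative, not from the original answer):
def using_datashader_third_var(ax, x, y, val):
    df = pd.DataFrame(dict(x=x, y=y, val=val))
    dsartist = dsshow(
        df,
        ds.Point("x", "y"),
        ds.mean("val"),   # color by the per-pixel mean of `val` instead of the point count
        norm="linear",
        aspect="auto",
        ax=ax,
    )
    plt.colorbar(dsartist, ax=ax, label="Mean of val per pixel")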
3: scatter_with_gaussian_kde
import numpy as np
from scipy.stats import gaussian_kde

def scatter_with_gaussian_kde(ax, x, y):
    # https://stackoverflow.com/a/20107592/3015186
    # Answer by Joe Kington
    xy = np.vstack([x, y])
    z = gaussian_kde(xy)(xy)
    ax.scatter(x, y, c=z, s=100, edgecolor='none')
It took 11 minutes to draw this:
4: using_hist2d
import matplotlib.pyplot as plt
def using_hist2d(ax, x, y, bins=(50, 50)):
    # https://stackoverflow.com/a/20105673/3015186
    # Answer by askewchan
    ax.hist2d(x, y, bins, cmap=plt.cm.jet)
It took 0.021 s to draw this with bins=(50,50):
It took 0.173 s to draw this with bins=(1000,1000):
Cons: The zoomed-in data does not look as good as with mpl-scatter-density or datashader. Also, you have to determine the number of bins yourself.
5: density_scatter
The code is as in the answer by Guillaume.
It took 0.073 s to draw this with bins=(50,50):
It took 0.368 s to draw this with bins=(1000,1000):
Also, if the number of points makes the KDE calculation too slow, the color can be interpolated from np.histogram2d. [Update in response to comments: if you wish to show the colorbar, use plt.scatter() instead of ax.scatter(), followed by plt.colorbar()]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.colors import Normalize
from scipy.interpolate import interpn
def density_scatter(x, y, ax=None, sort=True, bins=20, **kwargs):
    """
    Scatter plot colored by 2d histogram
    """
    if ax is None:
        fig, ax = plt.subplots()
    else:
        fig = ax.figure  # needed below for the colorbar when an Axes is passed in
    data, x_e, y_e = np.histogram2d(x, y, bins=bins, density=True)
    z = interpn((0.5*(x_e[1:] + x_e[:-1]), 0.5*(y_e[1:] + y_e[:-1])), data,
                np.vstack([x, y]).T, method="splinef2d", bounds_error=False)
    # To be sure to plot all data
    z[np.where(np.isnan(z))] = 0.0
    # Sort the points by density, so that the densest points are plotted last
    if sort:
        idx = z.argsort()
        x, y, z = x[idx], y[idx], z[idx]
    ax.scatter(x, y, c=z, **kwargs)
    norm = Normalize(vmin=np.min(z), vmax=np.max(z))
    cbar = fig.colorbar(cm.ScalarMappable(norm=norm), ax=ax)
    cbar.ax.set_ylabel('Density')
    return ax

if "__main__" == __name__:
    x = np.random.normal(size=100000)
    y = x * 3 + np.random.normal(size=100000)
    density_scatter(x, y, bins=[30, 30])
    plt.show()
You could make a histogram:
import numpy as np
import matplotlib.pyplot as plt
# fake data:
a = np.random.normal(size=1000)
b = a*3 + np.random.normal(size=1000)
plt.hist2d(a, b, (50, 50), cmap=plt.cm.jet)
plt.colorbar()
When I plot something with contourf, I see at the bottom of the plot window the current x and y values under the mouse cursor.
Is there a way to also see the z value?
Here is an example contourf:
import matplotlib.pyplot as plt
import numpy as np
plt.contourf(np.arange(16).reshape(-1,4))
The text that shows the position of the cursor is generated by ax.format_coord. You can override the method to also display a z-value. For instance,
import matplotlib.pyplot as plt
import numpy as np
import scipy.interpolate as si
data = np.arange(16).reshape(-1, 4)
X, Y = np.mgrid[:data.shape[0], :data.shape[1]]
cs = plt.contourf(X, Y, data)
func = si.interp2d(X, Y, data)
def fmt(x, y):
    z = np.take(func(x, y), 0)
    return 'x={x:.5f} y={y:.5f} z={z:.5f}'.format(x=x, y=y, z=z)
plt.gca().format_coord = fmt
plt.show()
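A side note: scipy.interpolate.interp2d was deprecated and then removed in recent SciPy releases (1.14+). If your SciPy no longer has it, a hedged alternative sketch is to build func with RegularGridInterpolator instead; the rest of the answer stays the same.
from scipy.interpolate import RegularGridInterpolator
# Assumes data, X, Y as defined above: X varies along axis 0, Y along axis 1.
func = RegularGridInterpolator(
    (np.arange(data.shape[0]), np.arange(data.shape[1])), data,
    bounds_error=False, fill_value=None)

def fmt(x, y):
    z = func((x, y)).item()
    return 'x={x:.5f} y={y:.5f} z={z:.5f}'.format(x=x, y=y, z=z)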
The documentation example shows how you can insert z-value labels into your plot
Script: http://matplotlib.sourceforge.net/mpl_examples/pylab_examples/contour_demo.py
Basically, it's
plt.figure()
CS = plt.contour(X, Y, Z)
plt.clabel(CS, inline=1, fontsize=10)
plt.title('Simplest default with labels')
Just a variant of wilywampa's answer. If you already have a pre-computed grid of interpolated contour values because your data is sparse or if you have a huge data matrix, this might be suitable for you.
import matplotlib.pyplot as plt
import numpy as np
resolution = 100
Z = np.arange(resolution**2).reshape(-1, resolution)
X, Y = np.mgrid[:Z.shape[0], :Z.shape[1]]
cs = plt.contourf(X, Y, Z)
Xflat, Yflat, Zflat = X.flatten(), Y.flatten(), Z.flatten()
def fmt(x, y):
    # get closest point with known data
    dist = np.linalg.norm(np.vstack([Xflat - x, Yflat - y]), axis=0)
    idx = np.argmin(dist)
    z = Zflat[idx]
    return 'x={x:.5f} y={y:.5f} z={z:.5f}'.format(x=x, y=y, z=z)
plt.colorbar()
plt.gca().format_coord = fmt
plt.show()