I am trying to make a figure to visualize the Lagrange multiplier method. This means I want to draw the graph of some function z = f(x,y), but also the constraint g(x,y) = c. Because I want to draw the graph of f, this must be a 3D plot. But the constraint g(x,y) = c is a level curve of g, and should lie in the xy-plane.
I am using Python, and here is my current code:
import matplotlib.pyplot as plt
from matplotlib import cm
import numpy as np
fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
X = np.arange(-5,5,0.5)
Y = X
X, Y = np.meshgrid(X, Y)
Z = 50 - X**2 - Y**2
surf = ax.plot_surface(X, Y, Z, cmap=cm.coolwarm)
ax.set_zlim(0, 50)
g = X**2+Y**2
CS = ax.contour(X,Y,g)
plt.show()
and this is the output:
Current plot
I only need one level curve of g in the xy-plane. Right now I get several, and none of them lies at z = 0. Ideally, I would also like to mark the points of z = f(x,y) that lie directly over g(x,y) = c.
I would really appreciate your feedback!
You need to add the optional argument "offset", so that the contour is projected onto a plane. To place it in the plane z = 0:
CS = ax.contour(X,Y,g, offset = 0)
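To get exactly one level curve, and to mark the points of the surface lying above it, something along the following lines may work. This is only a sketch: the constraint value c = 9 is a hypothetical choice, and the lifted curve relies on the fact that for this particular g the level set is a circle that can be parametrized directly.

```python
import numpy as np
import matplotlib.pyplot as plt

def f(x, y):
    return 50 - x**2 - y**2

def g(x, y):
    return x**2 + y**2

c = 9.0  # hypothetical constraint value, chosen only for illustration

xs = np.arange(-5, 5.25, 0.25)
X, Y = np.meshgrid(xs, xs)

fig, ax = plt.subplots(subplot_kw={"projection": "3d"})
ax.plot_surface(X, Y, f(X, Y), cmap="coolwarm", alpha=0.6)

# levels=[c] keeps only the curve g = c; offset=0 draws it in the plane z = 0
ax.contour(X, Y, g(X, Y), levels=[c], zdir="z", offset=0, colors="k")

# For this g the level curve is a circle of radius sqrt(c), so it can be
# parametrized directly and lifted onto the surface z = f(x, y)
t = np.linspace(0, 2 * np.pi, 200)
cx, cy = np.sqrt(c) * np.cos(t), np.sqrt(c) * np.sin(t)
ax.plot(cx, cy, f(cx, cy), color="k", linewidth=2)

ax.set_zlim(0, 50)
# plt.show()  # uncomment to display the figure
```

For a constraint without a closed-form parametrization, the curve would have to be extracted numerically instead.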
For my report, I'm creating a special color plot in jupyter notebook. There are two parameters, x and y.
import numpy as np
x = np.arange(-1,1,0.1)
y = np.arange(1,11,1)
with which I compute a third quantity. Here is an example to demonstrate the concept:
values = []
for i in range(len(y)):
    z = y[i] * x**3
    # in my case the value z represents phases of oscillators
    # so I will transform the computed values to the interval [0,2pi)
    values.append(z)
# parentheses are needed: % and * have equal precedence and are evaluated left to right
values = np.array(values) % (2*np.pi)
I'm plotting y vs x. For each y = 1,2,3,4... there will be a horizontal line with total length two. For example: The coordinate (0.5,8) stands for a single point on line 8 at position x = 0.5 and z(0.5,8) is its associated value.
Now I want to represent each point on all ten lines with a unique color that is determined by z(x,y). Since z(x,y) takes only values in [0,2pi) I need a color scheme that starts at zero (for example z=0 corresponds to blue). For increasing z the color continuously changes and in the end at 2pi it takes the same color again (so at z ~ 2pi it becomes blue again).
Does someone know how this can be done in python?
The kind of structure you need for x, y and z is easier to build with a meshgrid. Also, to get many x-values between -1 and 1, np.linspace(-1,1,N) returns N evenly spaced values (that is, N-1 equal intervals).
Using meshgrid, z can be calculated in one line using numpy's vectorization. This runs much faster.
To set a repeating color, a cyclic colormap such as hsv can be used. There the last color is the same as the starting color.
import numpy as np
from matplotlib import pyplot as plt
x, y = np.meshgrid(np.linspace(-1,1,100), np.arange(1,11,1))
z = (y * x**3) % (2*np.pi)
plt.scatter(x, y, c=z, s=6, cmap='hsv')
plt.yticks(range(1,11))
plt.show()
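One detail worth checking when wrapping values into [0, 2π): in Python, % and * have the same precedence and are evaluated left to right, so an expression of the form a % 2*np.pi is parsed as (a % 2) * np.pi. A small check with made-up sample values:

```python
import numpy as np

a = np.array([0.5, 3.0, 7.0, -1.0])  # arbitrary sample values

left_to_right = a % 2 * np.pi    # parsed as (a % 2) * np.pi
wrapped = a % (2 * np.pi)        # wraps the values into [0, 2*pi)
```

Only the second form is a true modulo by 2π.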
Alternatively, a symmetric colormap could be built by taking the colors from an existing map and combining them with the same colors in reverse order.
import numpy as np
from matplotlib import pyplot as plt
import matplotlib.colors as mcolors
colors_orig = plt.cm.viridis_r(np.linspace(0, 1, 128))
# combine the colors with the reversed array and build a new colormap
colors = np.vstack((colors_orig, colors_orig[::-1]))
symcmap = mcolors.LinearSegmentedColormap.from_list('symcmap', colors)
x, y = np.meshgrid(np.linspace(-1,1,100), np.arange(1,11,1))
z = (y * x**3) % (2*np.pi)
plt.scatter(x, y, c=z, s=6, cmap=symcmap)
plt.yticks(range(1,11))
plt.show()
Multicolored lines are somewhat more complicated than just scatter plots. The docs have an example using LineCollections. Here is the adapted code. Note that the line segments are colored using their start point, so make sure there are enough x values. Also, the x and y limits aren't set automatically any more.
The code also adds a colorbar to illustrate how the colors map to the z values. Some interesting code from Jake VanderPlas shows how to create ticks for multiples of π.
import numpy as np
from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection
# code from Jake VanderPlas
def format_func(value, tick_number):
    # find number of multiples of pi/2
    N = int(np.round(2 * value / np.pi))
    if N == 0:
        return "0"
    elif N == 1:
        return r"$\pi/2$"
    elif N == 2:
        return r"$\pi$"
    elif N % 2 > 0:
        return r"${0}\pi/2$".format(N)
    else:
        return r"${0}\pi$".format(N // 2)
x = np.linspace(-1, 1, 500)
y_max = 10
# Create a continuous norm to map from data points to colors
norm = plt.Normalize(0, 2 * np.pi)
for y in range(1, y_max + 1):
    # parentheses are needed: % and * have equal precedence, evaluated left to right
    z = (y * x ** 3) % (2 * np.pi)
    y_array = y * np.ones_like(x)
    points = np.array([x, y_array]).T.reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)
    lc = LineCollection(segments, cmap='hsv', norm=norm)
    lc.set_array(z) # Set the values used for colormapping
    lc.set_linewidth(2)
    line = plt.gca().add_collection(lc)
    # plt.scatter(x, y_array, c=z, s=10, norm=norm, cmap='hsv')
cbar = plt.colorbar(line) # , ticks=[k*np.pi for k in np.arange(0, 2.001, 0.25)])
cbar.locator = plt.MultipleLocator(np.pi / 2)
cbar.minor_locator = plt.MultipleLocator(np.pi / 4)
cbar.formatter = plt.FuncFormatter(format_func)
cbar.ax.minorticks_on()
cbar.update_ticks()
plt.yticks(range(1, y_max + 1)) # one tick for every y
plt.xlim(x.min(), x.max()) # the LineCollection doesn't force the limits
plt.ylim(0.5, y_max + 0.5)
plt.show()
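The reshape/concatenate step that builds the segments is easier to follow on a tiny input. A sketch with four made-up points on the line y = 5:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.full_like(x, 5.0)

# one row per point, with an extra axis so consecutive points can be paired
points = np.array([x, y]).T.reshape(-1, 1, 2)                   # shape (4, 1, 2)

# pair point i with point i+1: each segment is [[x0, y0], [x1, y1]]
segments = np.concatenate([points[:-1], points[1:]], axis=1)    # shape (3, 2, 2)
```

N points therefore give N-1 segments, which is why a dense x array is needed for smooth color transitions.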
I would like to plot the contour lines for this function, however I cannot find any useful way to do it.
The potential function is :
V(x,y,z) = cos(10x) + cos(10y) + cos(10z) + 2*(x^2 + y^2 + z^2)
I unsuccessfully attempted something like:
import numpy
import matplotlib.pyplot.contour
def V(x,y,z):
return numpy.cos(10*x) + numpy.cos(10*y) + numpy.cos(10*z) + 2*(x**2 + y**2 + z**2)
X, Y, Z = numpy.mgrid[-1:1:100j, -1:1:100j, -1:1:100j]
But then, I don't know how to do the next step to plot it?
matplotlib.pyplot.contour(X,Y,Z,V)
contour raises an error when you pass it three-dimensional arrays, as it expects two-dimensional arrays.
With this in mind, try:
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import cm
import numpy as np
def V(x,y,z):
    return np.cos(10*x) + np.cos(10*y) + np.cos(10*z) + 2*(x**2 + y**2 + z**2)
X,Y = np.mgrid[-1:1:100j, -1:1:100j]
Z_vals = [ -0.5, 0, 0.9 ]
num_subplots = len( Z_vals)
fig = plt.figure(figsize=(10, 4))
for i,z in enumerate( Z_vals):
    ax = fig.add_subplot(1 , num_subplots , i+1, projection='3d')
    ax.contour(X, Y, V(X,Y,z), cmap=cm.gnuplot)
    ax.set_title('z = %.2f'%z, fontsize=30)
fig.savefig('contours.png', facecolor='grey', edgecolor='none')
Alternatively, use ax.contourf(...) to draw filled contours, which looks nicer in my opinion.
There is no direct way to visualize a function of 3 variables, as its graph is an object (a hypersurface) which lives in 4 dimensions. One must play with slices of the function to see what's going on. By a slice, I mean the restriction of the function to a lower dimensional space. A slice is obtained by holding one or more of the function's variables constant.
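As a concrete illustration of a slice, here is a minimal sketch that fixes z and evaluates the potential on a 2D grid. The value z0 = 0 and the 101-point grid (chosen so that 0 is a grid point) are arbitrary assumptions.

```python
import numpy as np

def V(x, y, z):
    return np.cos(10*x) + np.cos(10*y) + np.cos(10*z) + 2*(x**2 + y**2 + z**2)

# 2D grid only; the third variable is held constant
X, Y = np.mgrid[-1:1:101j, -1:1:101j]
z0 = 0.0
slice_z0 = V(X, Y, z0)   # a plain 2D array, suitable for contour / contourf
```

The result has the same shape as X and Y, which is what contour expects.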
I'm not sure this is what the OP needed, but I think a possible solution might be this one:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
def compute_torus(precision, c, a):
    U = np.linspace(0, 2*np.pi, precision)
    V = np.linspace(0, 2*np.pi, precision)
    U, V = np.meshgrid(U, V)
    X = (c+a*np.cos(V))*np.cos(U)
    Y = (c+a*np.cos(V))*np.sin(U)
    Z = a*np.sin(V)
    return X, Y, Z
x, y, z = compute_torus(100, 2, 1)
color_dimension = z # Here goes the potential
minn, maxx = color_dimension.min(), color_dimension.max()
norm = matplotlib.colors.Normalize(minn, maxx)
m = plt.cm.ScalarMappable(norm=norm, cmap='jet')
m.set_array([])
fcolors = m.to_rgba(color_dimension)
# plot
fig = plt.figure()
# recent Matplotlib versions no longer accept keyword arguments in fig.gca(),
# so the 3D axes are created with add_subplot instead
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(x,y,z, rstride=1, cstride=1, facecolors=fcolors, vmin=minn, vmax=maxx, shade=False)
By setting color_dimension to the values of the potential function, this code plots the potential over a torus. In general, it can be plotted over any 3D shape given by (x,y,z), but of course if the 3D space is fully filled with points everywhere, it's unlikely the image will be clear.
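The color-mapping step can be looked at in isolation. In this sketch a small made-up array stands in for the potential values; Normalize rescales them to [0, 1] and the ScalarMappable turns each value into an RGBA tuple.

```python
import numpy as np
from matplotlib import cm, colors

# stand-in for the potential values sampled on a 3 x 4 surface grid
z = np.linspace(-1.0, 1.0, 12).reshape(3, 4)

norm = colors.Normalize(vmin=z.min(), vmax=z.max())
mappable = cm.ScalarMappable(norm=norm, cmap="viridis")
fcolors = mappable.to_rgba(z)   # one RGBA color per grid point: shape (3, 4, 4)
```

An array of this shape is what the facecolors argument of plot_surface takes.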
I'd like to plot implicit equation F(x,y,z) = 0 in 3D. Is it possible in Matplotlib?
You can trick matplotlib into plotting implicit equations in 3D. Just make a one-level contour plot of the equation for each z value within the desired limits. You can repeat the process along the x and y axes as well for a more solid-looking shape.
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np
def plot_implicit(fn, bbox=(-2.5,2.5)):
    ''' create a plot of an implicit function
    fn ...implicit function (plot where fn==0)
    bbox ..the x,y,and z limits of plotted interval'''
    xmin, xmax, ymin, ymax, zmin, zmax = bbox*3
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    A = np.linspace(xmin, xmax, 100) # resolution of the contour
    B = np.linspace(xmin, xmax, 15) # number of slices
    A1,A2 = np.meshgrid(A,A) # grid on which the contour is plotted
    for z in B: # plot contours in the XY plane
        X,Y = A1,A2
        Z = fn(X,Y,z)
        cset = ax.contour(X, Y, Z+z, [z], zdir='z')
        # [z] defines the only level to plot for this contour for this value of z
    for y in B: # plot contours in the XZ plane
        X,Z = A1,A2
        Y = fn(X,y,Z)
        cset = ax.contour(X, Y+y, Z, [y], zdir='y')
    for x in B: # plot contours in the YZ plane
        Y,Z = A1,A2
        X = fn(x,Y,Z)
        cset = ax.contour(X+x, Y, Z, [x], zdir='x')
    # must set plot limits because the contour will likely extend
    # way beyond the displayed level. Otherwise matplotlib extends the plot limits
    # to encompass all values in the contour.
    ax.set_zlim3d(zmin,zmax)
    ax.set_xlim3d(xmin,xmax)
    ax.set_ylim3d(ymin,ymax)
    plt.show()
Here's the plot of the Goursat Tangle:
def goursat_tangle(x,y,z):
    a,b,c = 0.0,-5.0,11.8
    return x**4+y**4+z**4+a*(x**2+y**2+z**2)**2+b*(x**2+y**2+z**2)+c
plot_implicit(goursat_tangle)
You can make it easier to visualize by adding depth cues with creative colormapping:
Here's how the OP's plot looks:
def hyp_part1(x,y,z):
    return -(x**2) - (y**2) + (z**2) - 1
plot_implicit(hyp_part1, bbox=(-100.,100.))
Bonus: You can use python to functionally combine these implicit functions:
def sphere(x,y,z):
    return x**2 + y**2 + z**2 - 2.0**2
def translate(fn,x,y,z):
    # evaluate fn at the un-shifted point (a-x, b-y, c-z); the original x-a form
    # also mirrors the shape, which only goes unnoticed for symmetric shapes
    return lambda a,b,c: fn(a-x,b-y,c-z)
def union(*fns):
    return lambda x,y,z: np.min(
        [fn(x,y,z) for fn in fns], 0)
def intersect(*fns):
    return lambda x,y,z: np.max(
        [fn(x,y,z) for fn in fns], 0)
def subtract(fn1, fn2):
    return intersect(fn1, lambda *args:-fn2(*args))
plot_implicit(union(sphere,translate(sphere, 1.,1.,1.)), (-2.,3.))
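Why min gives a union and max an intersection follows from the sign convention: a point is inside a shape where its function is negative. The following sketch uses its own small helper definitions (names and test points are chosen for illustration only) to check the signs at a few points.

```python
import numpy as np

def sphere(x, y, z, r=2.0):
    # negative inside the sphere of radius r, positive outside
    return x**2 + y**2 + z**2 - r**2

def translate(fn, dx, dy, dz):
    # shift the shape by (dx, dy, dz): evaluate fn at the un-shifted point
    return lambda x, y, z: fn(x - dx, y - dy, z - dz)

def union(*fns):
    # inside the union if inside at least one shape: the minimum is negative
    return lambda x, y, z: np.min([fn(x, y, z) for fn in fns], axis=0)

def intersect(*fns):
    # inside the intersection only if inside every shape: the maximum is negative
    return lambda x, y, z: np.max([fn(x, y, z) for fn in fns], axis=0)

shifted = translate(sphere, 1.0, 1.0, 1.0)
both = union(sphere, shifted)
overlap = intersect(sphere, shifted)

inside_first_only = (-1.5, 0.0, 0.0)
inside_both = (0.5, 0.5, 0.5)
outside_all = (3.0, 3.0, 3.0)
```

The point inside only the first sphere is inside the union but outside the intersection.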
Update: I have finally found an easy way to render 3D implicit surfaces with matplotlib and scikit-image, see my other answer. I left this one for anyone interested in plotting parametric 3D surfaces.
Motivation
Late answer: I just needed to do the same and found another way to do it, to some extent. So I am sharing this other perspective.
This post does not answer: (1) How to plot any implicit function F(x,y,z)=0? But it does answer: (2) How to plot parametric surfaces (not all implicit functions, but some of them) using a mesh with matplotlib?
@Paul's method has the advantage of being non-parametric, so we can plot almost anything we want using the contour method along each axis; it fully addresses (1). But matplotlib cannot easily build a mesh from this method, so we cannot directly get a surface from it; instead we get plane curves in all directions. This is what motivated my answer: I wanted to address (2).
Rendering mesh
If we are able to parametrize the surface we want to plot with at most 2 parameters (this may be hard or impossible), then we can plot it with Matplotlib's plot_trisurf method.
That is, from an implicit equation F(x,y,z)=0, if we are able to get a parametric system S={x=f(u,v), y=g(u,v), z=h(u,v)} then we can plot it easily with matplotlib without having to resort to contour.
Then, rendering such a 3D surface boils down to:
# Render:
ax = plt.axes(projection='3d')
ax.plot_trisurf(x, y, z, triangles=tri.triangles, cmap='jet', antialiased=True)
Here (x, y, z) are vectors (not meshgrids, see ravel) computed from the parameters (u, v), and the triangles argument comes from a Triangulation built on the (u, v) parameters to support the mesh construction.
Imports
Required imports are:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
from matplotlib.tri import Triangulation
Some surfaces
Let's parametrize some surfaces...
Sphere
# Parameters:
theta = np.linspace(0, 2*np.pi, 20)
phi = np.linspace(0, np.pi, 20)
theta, phi = np.meshgrid(theta, phi)
rho = 1
# Parametrization:
x = np.ravel(rho*np.cos(theta)*np.sin(phi))
y = np.ravel(rho*np.sin(theta)*np.sin(phi))
z = np.ravel(rho*np.cos(phi))
# Triangulation:
tri = Triangulation(np.ravel(theta), np.ravel(phi))
Cone
theta = np.linspace(0, 2*np.pi, 20)
rho = np.linspace(-2, 2, 20)
theta, rho = np.meshgrid(theta, rho)
x = np.ravel(rho*np.cos(theta))
y = np.ravel(rho*np.sin(theta))
z = np.ravel(rho)
tri = Triangulation(np.ravel(theta), np.ravel(rho))
Torus
a, c = 1, 4
u = np.linspace(0, 2*np.pi, 20)
v = u.copy()
u, v = np.meshgrid(u, v)
x = np.ravel((c + a*np.cos(v))*np.cos(u))
y = np.ravel((c + a*np.cos(v))*np.sin(u))
z = np.ravel(a*np.sin(v))
tri = Triangulation(np.ravel(u), np.ravel(v))
Möbius Strip
u = np.linspace(0, 2*np.pi, 20)
v = np.linspace(-1, 1, 20)
u, v = np.meshgrid(u, v)
x = np.ravel((2 + (v/2)*np.cos(u/2))*np.cos(u))
y = np.ravel((2 + (v/2)*np.cos(u/2))*np.sin(u))
z = np.ravel(v/2*np.sin(u/2))
tri = Triangulation(np.ravel(u), np.ravel(v))
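For reference, the imports, one parametrization and the render call can be assembled into a single script. The sketch below does this for the torus with the same a, c values as above; the parameter grid is flattened first so that x, y and z are plain vectors.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.tri import Triangulation

a, c = 1, 4
u = np.linspace(0, 2 * np.pi, 20)
v = np.linspace(0, 2 * np.pi, 20)
u, v = np.meshgrid(u, v)
u, v = u.ravel(), v.ravel()          # plot_trisurf wants 1D vectors

# Parametrization of the torus
x = (c + a * np.cos(v)) * np.cos(u)
y = (c + a * np.cos(v)) * np.sin(u)
z = a * np.sin(v)

# Delaunay triangulation in parameter space (u, v), reused for the 3D mesh
tri = Triangulation(u, v)

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_trisurf(x, y, z, triangles=tri.triangles, cmap="jet", antialiased=True)
# plt.show()  # uncomment to display the figure
```

The triangulation is computed on the flat (u, v) grid, where it is well defined, and only the triangle indices are carried over to the 3D points.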
Limitation
Most of the time, a Triangulation is required to drive the mesh construction of the plot_trisurf method, and this object only accepts two parameters, so we are limited to surfaces that can be parametrized with two parameters. It is unlikely we could represent the Goursat Tangle with this method.
Matplotlib expects a series of points; it will do the plotting if you can figure out how to render your equation.
Referring to Is it possible to plot implicit equations using Matplotlib? Mike Graham's answer suggests using scipy.optimize to numerically explore the implicit function.
There is an interesting gallery at http://xrt.wikidot.com/gallery:implicit showing a variety of raytraced implicit functions - if your equation matches one of these, it might give you a better idea what you are looking at.
Failing that, if you care to share the actual equation, maybe someone can suggest an easier approach.
As far as I know, it is not possible. You have to solve this equation numerically yourself. Using scipy.optimize is a good idea. The simplest case is when you know the range of the surface that you want to plot: make a regular grid in x and y, and try to solve the equation F(xi,yi,z)=0 for z, given a starting point for z. The following is some very rough code that might help you:
import numpy as np
from scipy import optimize
# named x_range / y_range to avoid shadowing built-in names
x_range = (0,1)
y_range = (0,1)
density = 100
startz = 1
def F(x,y,z):
    return x**2+y**2+z**2-10
x = np.linspace(x_range[0],x_range[1],density)
y = np.linspace(y_range[0],y_range[1],density)
points = []
for xi in x:
    for yi in y:
        g = lambda z:F(xi,yi,z)
        res = optimize.fsolve(g, startz, full_output=1)
        if res[2] == 1: # ier == 1 means fsolve found a solution
            zi = res[0][0] # fsolve returns an array; take the scalar root
            points.append([xi,yi,zi])
points = np.array(points)
Actually there is an easy way to plot an implicit 3D surface with the scikit-image package. The key is the marching_cubes function.
import numpy as np
from skimage import measure
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
Then we compute the function over a 3D meshgrid. In this example we use the goursat_tangle function that @Paul defined in his answer:
xl = np.linspace(-3, 3, 50)
X, Y, Z = np.meshgrid(xl, xl, xl)
F = goursat_tangle(X, Y, Z)
The magic is happening here with marching_cubes:
verts, faces, normals, values = measure.marching_cubes(F, 0, spacing=[np.diff(xl)[0]]*3)
verts -= 3
We just need to correct the vertex coordinates, as they are expressed in voxel coordinates (hence the scaling with the spacing argument and the subsequent origin shift).
Finally, it is just a matter of rendering the iso-surface using plot_trisurf:
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_trisurf(verts[:, 0], verts[:, 1], faces, verts[:, 2], cmap='jet', lw=0)
Which returns:
Have you looked at mplot3d on matplotlib?
Finally, I did it (I updated my matplotlib to 1.0.1).
Here is code:
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
def hyp_part1(x,y,z):
    return -(x**2) - (y**2) + (z**2) - 1
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x_range = np.arange(-100,100,10)
y_range = np.arange(-100,100,10)
X,Y = np.meshgrid(x_range,y_range)
A = np.linspace(-100, 100, 15)
A1,A2 = np.meshgrid(A,A)
for z in A:
    X,Y = A1, A2
    Z = hyp_part1(X,Y,z)
    ax.contour(X, Y, Z+z, [z], zdir='z')
for y in A:
    X,Z= A1, A2
    Y = hyp_part1(X,y,Z)
    ax.contour(X, Y+y, Z, [y], zdir='y')
for x in A:
    Y,Z = A1, A2
    X = hyp_part1(x,Y,Z)
    ax.contour(X+x, Y, Z, [x], zdir='x')
ax.set_zlim3d(-100,100)
ax.set_xlim3d(-100,100)
ax.set_ylim3d(-100,100)
Here is result:
Thank You, Paul!
MathGL (GPL plotting library) can plot it easily. Just create a data mesh with function values f[i,j,k] and use Surf3() function to make isosurface at value f[i,j,k]=0. See this sample.
I have a set of X,Y data points (about 10k) that are easy to plot as a scatter plot but that I would like to represent as a heatmap.
I looked through the examples in Matplotlib and they all seem to already start with heatmap cell values to generate the image.
Is there a method that converts a bunch of x, y, all different, to a heatmap (where zones with higher frequency of x, y would be "warmer")?
If you don't want hexagons, you can use numpy's histogram2d function:
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.clf()
plt.imshow(heatmap.T, extent=extent, origin='lower')
plt.show()
This makes a 50x50 heatmap. If you want, say, 512x384, you can put bins=(512, 384) in the call to histogram2d.
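A quick way to see what histogram2d returns, and why the array is transposed before imshow, is to look at the shapes for unequal bin counts. A sketch with seeded random test data:

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded only so the run is repeatable
x = rng.normal(size=8873)
y = rng.normal(size=8873)

heatmap, xedges, yedges = np.histogram2d(x, y, bins=(512, 384))

# histogram2d puts x along the first axis; imshow expects rows to be y,
# so the array is transposed before display
image = heatmap.T
```

Every point lands in exactly one cell, so the counts add up to the number of points.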
Example:
In Matplotlib lexicon, I think you want a hexbin plot.
If you're not familiar with this type of plot, it's just a bivariate histogram in which the xy-plane is tessellated by a regular grid of hexagons.
So, as with an ordinary histogram, you discretize the plotting region into a set of windows (here hexagons), assign each point to one of these windows, and count the number of points falling in each; finally, map the counts onto a color array, and you've got a hexbin diagram.
Though hexagons are less commonly used than, e.g., circles or squares, it is intuitive that they are a better choice for the geometry of the binning container:
hexagons have nearest-neighbor symmetry (square bins don't; e.g., the distance from a point on a square's border to a point inside that square is not everywhere equal) and
the hexagon is the highest n-polygon that gives a regular plane tessellation (i.e., you can safely re-model your kitchen floor with hexagonal-shaped tiles because you won't have any void space between the tiles when you are finished, which is not true for all other higher-n, n >= 7, polygons).
(Matplotlib uses the term hexbin plot; so do (AFAIK) all of the plotting libraries for R; still I don't know if this is the generally accepted term for plots of this type, though I suspect it's likely given that hexbin is short for hexagonal binning, which describes the essential step in preparing the data for display.)
from matplotlib import pyplot as PLT
from matplotlib import cm as CM
import numpy as NP
# mlab.bivariate_normal is no longer available in current Matplotlib releases;
# this helper reproduces it for the uncorrelated case used here
def bivariate_normal(X, Y, sigmax, sigmay, mux, muy):
    z = ((X - mux) / sigmax)**2 + ((Y - muy) / sigmay)**2
    return NP.exp(-z / 2) / (2 * NP.pi * sigmax * sigmay)
x = y = NP.linspace(-5, 5, 100)
X, Y = NP.meshgrid(x, y)
Z1 = bivariate_normal(X, Y, 2, 2, 0, 0)
Z2 = bivariate_normal(X, Y, 4, 1, 1, 1)
ZD = Z2 - Z1
x = X.ravel()
y = Y.ravel()
z = ZD.ravel()
gridsize=30
PLT.subplot(111)
# if 'bins=None', then color of each hexagon corresponds directly to its count
# 'C' is optional--it maps values to x-y coordinates; if 'C' is None (default) then
# the result is a pure 2D histogram
PLT.hexbin(x, y, C=z, gridsize=gridsize, cmap=CM.jet, bins=None)
PLT.axis([x.min(), x.max(), y.min(), y.max()])
cb = PLT.colorbar()
cb.set_label('mean value')
PLT.show()
Edit: For a better approximation of Alejandro's answer, see below.
I know this is an old question, but I wanted to add something to Alejandro's answer: If you want a nice smoothed image without using py-sphviewer you can instead use np.histogram2d and apply a Gaussian filter (from scipy.ndimage) to the heatmap:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
# scipy.ndimage.filters is a deprecated alias; import from scipy.ndimage directly
from scipy.ndimage import gaussian_filter
def myplot(x, y, s, bins=1000):
    heatmap, xedges, yedges = np.histogram2d(x, y, bins=bins)
    heatmap = gaussian_filter(heatmap, sigma=s)
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    return heatmap.T, extent
fig, axs = plt.subplots(2, 2)
# Generate some test data
x = np.random.randn(1000)
y = np.random.randn(1000)
sigmas = [0, 16, 32, 64]
for ax, s in zip(axs.flatten(), sigmas):
    if s == 0:
        ax.plot(x, y, 'k.', markersize=5)
        ax.set_title("Scatter plot")
    else:
        img, extent = myplot(x, y, s)
        ax.imshow(img, extent=extent, origin='lower', cmap=cm.jet)
        # raw string so that \s is not treated as an escape sequence
        ax.set_title(r"Smoothing with $\sigma$ = %d" % s)
plt.show()
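What the filter does to the counts can be seen on a single occupied bin. In this sketch (grid size and sigma are arbitrary choices) one count in the middle of an otherwise empty histogram is spread into a Gaussian blob while the total stays the same.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

counts = np.zeros((101, 101))
counts[50, 50] = 1.0                      # one point in the central bin

smoothed = gaussian_filter(counts, sigma=2)
peak = np.unravel_index(smoothed.argmax(), smoothed.shape)
```

The peak stays where the point was, but its height drops as sigma grows, which is why large sigmas wash out local structure.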
Produces:
The scatter plot and s=16 plotted on top of each other for Agape Gal'lo (click for better view):
One difference I noticed between my Gaussian filter approach and Alejandro's approach was that his method shows local structures much better than mine. Therefore I implemented a simple nearest neighbour method at pixel level. For each pixel, this method calculates the inverse of the sum of the distances to the n closest points in the data. At high resolution this method is pretty computationally expensive and I think there's a quicker way, so let me know if you have any improvements.
Update: As I suspected, there's a much faster method using SciPy's scipy.spatial.cKDTree. See Gabriel's answer for the implementation.
Anyway, here's my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
def data_coord2view_coord(p, vlen, pmin, pmax):
    dp = pmax - pmin
    dv = (p - pmin) / dp * vlen
    return dv
def nearest_neighbours(xs, ys, reso, n_neighbours):
    im = np.zeros([reso, reso])
    extent = [np.min(xs), np.max(xs), np.min(ys), np.max(ys)]
    xv = data_coord2view_coord(xs, reso, extent[0], extent[1])
    yv = data_coord2view_coord(ys, reso, extent[2], extent[3])
    for x in range(reso):
        for y in range(reso):
            xp = (xv - x)
            yp = (yv - y)
            d = np.sqrt(xp**2 + yp**2)
            im[y][x] = 1 / np.sum(d[np.argpartition(d.ravel(), n_neighbours)[:n_neighbours]])
    return im, extent
n = 1000
xs = np.random.randn(n)
ys = np.random.randn(n)
resolution = 250
fig, axes = plt.subplots(2, 2)
for ax, neighbours in zip(axes.flatten(), [0, 16, 32, 64]):
    if neighbours == 0:
        ax.plot(xs, ys, 'k.', markersize=2)
        ax.set_aspect('equal')
        ax.set_title("Scatter Plot")
    else:
        im, extent = nearest_neighbours(xs, ys, resolution, neighbours)
        ax.imshow(im, origin='lower', extent=extent, cmap=cm.jet)
        ax.set_title("Smoothing over %d neighbours" % neighbours)
        ax.set_xlim(extent[0], extent[1])
        ax.set_ylim(extent[2], extent[3])
plt.show()
Result:
Instead of using np.histogram2d, which in general produces quite ugly histograms, I would like to recycle py-sphviewer, a Python package for rendering particle simulations using an adaptive smoothing kernel, which can be easily installed from pip (see webpage documentation). Consider the following code, which is based on the example:
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
import sphviewer as sph
def myplot(x, y, nb=32, xsize=500, ysize=500):
    xmin = np.min(x)
    xmax = np.max(x)
    ymin = np.min(y)
    ymax = np.max(y)
    x0 = (xmin+xmax)/2.
    y0 = (ymin+ymax)/2.
    pos = np.zeros([len(x),3])
    pos[:,0] = x
    pos[:,1] = y
    w = np.ones(len(x))
    P = sph.Particles(pos, w, nb=nb)
    S = sph.Scene(P)
    S.update_camera(r='infinity', x=x0, y=y0, z=0,
                    xsize=xsize, ysize=ysize)
    R = sph.Render(S)
    R.set_logscale()
    img = R.get_image()
    extent = R.get_extent()
    # Python 3: range instead of xrange, print as a function
    for i, j in zip(range(4), [x0,x0,y0,y0]):
        extent[i] += j
    print(extent)
    return img, extent
fig = plt.figure(1, figsize=(10,10))
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
# Generate some test data
x = np.random.randn(1000)
y = np.random.randn(1000)
#Plotting a regular scatter plot
ax1.plot(x,y,'k.', markersize=5)
ax1.set_xlim(-3,3)
ax1.set_ylim(-3,3)
heatmap_16, extent_16 = myplot(x,y, nb=16)
heatmap_32, extent_32 = myplot(x,y, nb=32)
heatmap_64, extent_64 = myplot(x,y, nb=64)
ax2.imshow(heatmap_16, extent=extent_16, origin='lower', aspect='auto')
ax2.set_title("Smoothing over 16 neighbors")
ax3.imshow(heatmap_32, extent=extent_32, origin='lower', aspect='auto')
ax3.set_title("Smoothing over 32 neighbors")
#Make the heatmap using a smoothing over 64 neighbors
ax4.imshow(heatmap_64, extent=extent_64, origin='lower', aspect='auto')
ax4.set_title("Smoothing over 64 neighbors")
plt.show()
which produces the following image:
As you see, the images look pretty nice, and we are able to identify different substructures in them. These images are constructed by spreading a given weight for every point over a certain domain, defined by the smoothing length, which in turn is given by the distance to the nb-th nearest neighbor (I've chosen 16, 32 and 64 for the examples). So, points in higher density regions are typically spread over smaller regions than points in lower density regions.
The function myplot is just a very simple function that I've written in order to give the x,y data to py-sphviewer to do the magic.
If you are using Matplotlib 1.2.x:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.randn(100000)
y = np.random.randn(100000)
plt.hist2d(x,y,bins=100)
plt.show()
Seaborn now has the jointplot function which should work nicely here:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)
sns.jointplot(x=x, y=y, kind='hex')
plt.show()
Here's Jurgy's great nearest neighbour approach but implemented using scipy.spatial.cKDTree. In my tests it's about 100x faster.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from scipy.spatial import cKDTree
def data_coord2view_coord(p, resolution, pmin, pmax):
    dp = pmax - pmin
    dv = (p - pmin) / dp * resolution
    return dv
n = 1000
xs = np.random.randn(n)
ys = np.random.randn(n)
resolution = 250
extent = [np.min(xs), np.max(xs), np.min(ys), np.max(ys)]
xv = data_coord2view_coord(xs, resolution, extent[0], extent[1])
yv = data_coord2view_coord(ys, resolution, extent[2], extent[3])
def kNN2DDens(xv, yv, resolution, neighbours, dim=2):
    """
    Inverse of the summed distances from every pixel to its nearest data points.
    """
    # Create the tree
    tree = cKDTree(np.array([xv, yv]).T)
    # For every grid (pixel) position, find the `neighbours` closest data points
    grid = np.mgrid[0:resolution, 0:resolution].T.reshape(resolution**2, dim)
    dists = tree.query(grid, neighbours)
    # Inverse of the sum of the distances from each grid point to its neighbours
    inv_sum_dists = 1. / dists[0].sum(1)
    # Reshape
    im = inv_sum_dists.reshape(resolution, resolution)
    return im
fig, axes = plt.subplots(2, 2, figsize=(15, 15))
for ax, neighbours in zip(axes.flatten(), [0, 16, 32, 63]):
    if neighbours == 0:
        ax.plot(xs, ys, 'k.', markersize=5)
        ax.set_aspect('equal')
        ax.set_title("Scatter Plot")
    else:
        im = kNN2DDens(xv, yv, resolution, neighbours)
        ax.imshow(im, origin='lower', extent=extent, cmap=cm.Blues)
        ax.set_title("Smoothing over %d neighbours" % neighbours)
        ax.set_xlim(extent[0], extent[1])
        ax.set_ylim(extent[2], extent[3])
plt.savefig('new.png', dpi=150, bbox_inches='tight')
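That the KD-tree query computes the same quantity as the brute-force distance sort can be checked on a small made-up data set. This is only a sketch; the point count, query positions and k are arbitrary.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 2))                       # data points
grid = np.array([[0.0, 0.0], [1.0, -0.5], [-2.0, 2.0]])  # a few "pixel" positions
k = 16

# Brute force: all pairwise distances, then sum the k smallest per grid point
d = np.linalg.norm(grid[:, None, :] - pts[None, :, :], axis=2)   # shape (3, 200)
brute = 1.0 / np.sort(d, axis=1)[:, :k].sum(axis=1)

# KD-tree: query returns the k nearest distances directly
dists, _ = cKDTree(pts).query(grid, k=k)
tree = 1.0 / dists.sum(axis=1)
```

The speed-up comes from the tree avoiding the full distance matrix, not from a different definition of the density.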
And the initial question was... how to convert scatter values to grid values, right?
histogram2d counts the frequency per cell; however, if you have data per cell other than just the frequency, you need to do some additional work.
x = data_x # between -10 and 4, log-gamma of an svc
y = data_y # between -4 and 11, log-C of an svc
z = data_z #between 0 and 0.78, f1-values from a difficult dataset
So, I have a dataset with Z-results for X and Y coordinates. However, I calculated only a few points outside the area of interest (large gaps), and heaps of points in a small area of interest.
Yes, here it becomes more difficult but also more fun. Some libraries (sorry):
from matplotlib import pyplot as plt
from matplotlib import cm
import numpy as np
from scipy.interpolate import griddata
pyplot is my graphic engine today,
cm is a range of color maps with some interesting choices,
numpy is for the calculations,
and griddata is for attaching values to a fixed grid.
The last one is important especially because the frequency of xy points is not equally distributed in my data. First, let's start with some boundaries fitting to my data and an arbitrary grid size. The original data has datapoints also outside those x and y boundaries.
#determine grid boundaries
gridsize = 500
x_min = -8
x_max = 2.5
y_min = -2
y_max = 7
So we have defined a grid with 500 pixels between the min and max values of x and y.
In my data, there are many more than 500 values available in the area of high interest, whereas in the low-interest area there are not even 200 values in the total grid; between the graphic boundaries of x_min and x_max there are even fewer.
So for getting a nice picture, the task is to get an average for the high interest values and to fill the gaps elsewhere.
I define my grid now. For each xx-yy pair, i want to have a color.
xx = np.linspace(x_min, x_max, gridsize) # array of x values
yy = np.linspace(y_min, y_max, gridsize) # array of y values
grid = np.array(np.meshgrid(xx, yy.T))
grid = grid.reshape(2, grid.shape[1]*grid.shape[2]).T
Why the strange shape? scipy.interpolate.griddata wants a shape of (n, D).
Griddata calculates one value per point in the grid, by a predefined method.
I choose "nearest" - empty grid points will be filled with values from the nearest neighbor. This looks as if the areas with less information have bigger cells (even if it is not the case). One could choose to interpolate "linear", then areas with less information look less sharp. Matter of taste, really.
points = np.array([x, y]).T # because griddata wants it that way
z_grid2 = griddata(points, z, grid, method='nearest')
# you get a 1D vector as result. Reshape to picture format!
# meshgrid(xx, yy) yields arrays of shape (len(yy), len(xx)), so rows correspond to y
z_grid2 = z_grid2.reshape(yy.shape[0], xx.shape[0])
And hop, we hand over to matplotlib to display the plot
fig = plt.figure(1, figsize=(10, 10))
ax1 = fig.add_subplot(111)
ax1.imshow(z_grid2, extent=[x_min, x_max,y_min, y_max, ],
           origin='lower', cmap=cm.magma)
ax1.set_title("SVC: empty spots filled by nearest neighbours")
ax1.set_xlabel('log gamma')
ax1.set_ylabel('log C')
plt.show()
Around the pointy part of the V-Shape, you see I did a lot of calculations during my search for the sweet spot, whereas the less interesting parts almost everywhere else have a lower resolution.
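The flatten, interpolate and reshape steps are easiest to check on a tiny non-square grid. In this sketch four made-up data points sit at the corners of the unit square and a 3 x 5 grid is filled by nearest neighbour.

```python
import numpy as np
from scipy.interpolate import griddata

# four scattered samples (corners of the unit square) with made-up values
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([10.0, 20.0, 30.0, 40.0])

xx = np.linspace(0, 1, 5)    # 5 columns
yy = np.linspace(0, 1, 3)    # 3 rows
gx, gy = np.meshgrid(xx, yy)                       # both of shape (3, 5)
grid = np.column_stack([gx.ravel(), gy.ravel()])   # shape (15, 2), as griddata expects

# the result is 1D; reshape to (rows, columns) = (len(yy), len(xx))
z = griddata(points, values, grid, method="nearest").reshape(gy.shape)
```

With a square grid the order of the two reshape arguments does not matter, which is why the issue is easy to miss.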
Make a 2-dimensional array that corresponds to the cells in your final image, called say heatmap_cells and instantiate it as all zeroes.
Choose two scaling factors that define the difference between each array element in real units, for each dimension, say x_scale and y_scale. Choose these such that all your datapoints will fall within the bounds of the heatmap array.
For each raw datapoint with x_value and y_value:
heatmap_cells[floor(x_value/x_scale),floor(y_value/y_scale)]+=1
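The recipe above can be sketched in runnable form as follows. The function name bin_points and the sample data are hypothetical, and the sketch assumes non-negative coordinates that fit inside the chosen array shape.

```python
import numpy as np

def bin_points(xs, ys, x_scale, y_scale, shape):
    """Count points per cell; assumes 0 <= x/x_scale < shape[0], same for y."""
    heatmap_cells = np.zeros(shape, dtype=int)
    ix = np.floor(np.asarray(xs) / x_scale).astype(int)
    iy = np.floor(np.asarray(ys) / y_scale).astype(int)
    # np.add.at accumulates correctly even when several points hit the same cell
    np.add.at(heatmap_cells, (ix, iy), 1)
    return heatmap_cells

xs = [0.1, 0.2, 1.5, 2.9]
ys = [0.1, 0.3, 0.7, 1.9]
cells = bin_points(xs, ys, x_scale=1.0, y_scale=1.0, shape=(3, 2))
```

np.add.at is used instead of a plain indexed += because the latter counts repeated indices only once.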
Very similar to @Piti's answer, but using 1 call instead of 2 to generate the points:
import numpy as np
import matplotlib.pyplot as plt
pts = 1000000
mean = [0.0, 0.0]
cov = [[1.0,0.0],[0.0,1.0]]
x,y = np.random.multivariate_normal(mean, cov, pts).T
plt.hist2d(x, y, bins=50, cmap=plt.cm.jet)
plt.show()
Output:
Here's one I made on a 1 Million point set with 3 categories (colored Red, Green, and Blue). Here's a link to the repository if you'd like to try the function. Github Repo
histplot(
X,
Y,
labels,
bins=2000,
range=((-3,3),(-3,3)),
normalize_each_label=True,
colors = [
[1,0,0],
[0,1,0],
[0,0,1]],
gain=50)
I'm afraid I'm a little late to the party, but I had a similar question a while ago. The accepted answer (by @ptomato) helped me out, but I also want to post this in case it's of use to someone.
''' I wanted to create a heatmap resembling a football pitch which would show the different actions performed '''
import numpy as np
import matplotlib.pyplot as plt
#fixing random state for reproducibility
np.random.seed(1234324)
fig = plt.figure(12)
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
#Grid roughly matching the pitch ratio; one extra row and column so that
#int(6.5) = 6 and int(10.5) = 10 are still valid indices
hmap = np.zeros((7, 11), dtype=int)
#print(hmap)
xlist = np.random.uniform(low=0.0, high=100.0, size=(20))
ylist = np.random.uniform(low=0.0, high =100.0, size =(20))
#UEFA Pitch Standards are 105m x 68m
xlist = (xlist/100)*10.5
ylist = (ylist/100)*6.5
ax1.scatter(xlist,ylist)
#int of the co-ordinates to populate the array
xlist_int = xlist.astype (int)
ylist_int = ylist.astype (int)
#print(xlist_int, ylist_int)
for i, j in zip(xlist_int, ylist_int):
    #this populates the array according to the x,y co-ordinate values it encounters
    hmap[j][i]= hmap[j][i] + 1
#Reversing the rows is necessary
hmap = hmap[::-1]
#print(hmap)
im = ax2.imshow(hmap)
Here's the result
None of these solutions worked for my application, so this is what I came up with. Essentially I am placing a 2D Gaussian at every single point:
import cv2
import numpy as np
import matplotlib.pyplot as plt
def getGaussian2D(ksize, sigma, norm=True):
    oneD = cv2.getGaussianKernel(ksize=ksize, sigma=sigma)
    twoD = np.outer(oneD.T, oneD)
    return twoD / np.sum(twoD) if norm else twoD
def pt2heat(pts, shape, kernel=16, sigma=5):
    heat = np.zeros(shape)
    k = getGaussian2D(kernel, sigma)
    for y,x in pts:
        x, y = int(x), int(y)
        for i in range(-kernel//2, kernel//2):
            for j in range(-kernel//2, kernel//2):
                if 0 <= x+i < shape[0] and 0 <= y+j < shape[1]:
                    heat[x+i, y+j] = heat[x+i, y+j] + k[i+kernel//2, j+kernel//2]
    return heat
# pts (the point pairs) and img (the image array) come from your own data
heat = pt2heat(pts, img.shape[:2])
# 'heat' is not a Matplotlib colormap name; 'hot' is the closest built-in one
plt.imshow(heat, cmap='hot')
Here are the points overlaid on top of their associated image, along with the resulting heat map: