Continuous rather than discrete intensity gradient describing function in python

I'm trying to plot a function using an intensity/colour scale, but the result comes out discrete rather than as a continuous gradient of colour where white (for example) is maximum intensity and black is 0. It doesn't seem to be affected by the number of points in the np.linspace, which is what I'm a bit confused about.
import numpy as np
import matplotlib.pyplot as plt

x = y = np.linspace(0, 4*np.pi, 2000)

def cos(x, y):
    return np.cos(x)**2

def squared(x, y):
    return x**2

X, Y = np.meshgrid(x, y)

Z = cos(X, Y)
plt.contourf(Z, cmap='Greys')

Z = squared(X, Y)
plt.contourf(Z, cmap='Greys')

plt.contourf is supposed to be discrete, as it shows filled regions between contour levels - that is how you can see the contours. One option that you have for that scenario is the following:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
x = y = np.linspace(0, 4*np.pi, 2000)
def cos(x, y):
    return np.cos(x)**2

def squared(x, y):
    return x**2
X, Y = np.meshgrid(x, y)
Z = cos(X, Y)
plt.imshow(Z, vmin = 0., vmax = 1., cmap=plt.cm.gray_r) # gray_r to reverse color and make it as you show in your images
plt.show()

Here you are plotting a filled contour plot. A contour plot mostly makes sense if you want to show discrete contours. E.g. weather maps often show isobars in that style or geographic maps show lines of equal terrain height via "contours".
By default, matplotlib chooses a number of ~8 contours, but the exact number may vary depending on the data scale.
You can choose the number of levels (approximately) via the levels argument. So increasing that number will show you a more continuous gradient.
plt.contourf(Z, levels = 121, cmap = 'Greys')
In general, however, if a continuous image is desired, one would rather plot an actual image:
dx = dy = np.diff(x)[0]
extent = x.min()-dx/2, x.max()+dx/2, y.min()-dy/2, y.max()+dy/2
plt.imshow(Z, cmap = 'Greys', extent=extent, aspect="auto")
You may notice how there is almost no visual difference between the two, yet the imshow approach is much (much, much) faster, because no contouring algorithm needs to be used.
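To get a rough sense of that difference, here is a minimal timing sketch (an informal micro-benchmark, not the answer's own measurement; numbers will vary with machine, backend and matplotlib version):
import time
import numpy as np
import matplotlib.pyplot as plt

x = y = np.linspace(0, 4*np.pi, 2000)
X, Y = np.meshgrid(x, y)
Z = np.cos(X)**2

t0 = time.perf_counter()
plt.contourf(Z, levels=121, cmap='Greys')   # runs the contouring algorithm
t1 = time.perf_counter()
plt.figure()
plt.imshow(Z, cmap='Greys', aspect='auto')  # just maps the array to pixels
t2 = time.perf_counter()
print("contourf: %.2f s, imshow: %.3f s" % (t1 - t0, t2 - t1))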

Related

Partially discrete colormap matplotlib

While there are a number of examples of discrete colormaps (a, b, c), I would like to do something a little different. I want to have a 3D surface plot that has a sharp contrast between a small value and zero, so the colors 'jump' or the colormap is partially discrete. My reason for this is that I want to more clearly distinguish between small values and what is considered to be 'zero' within a plot.
I am generating a 3D surface plot and want to use a colormap (like 'terrain') to indicate height on the Z-axis. However, I want there to be a 'gap' in the colormap to highlight values that are sufficiently far from z=0. Specifically, let's say z<1e-6 is the bottom threshold of the colormap (e.g., dark blue for terrain), but I want any value above that threshold to sit in the middle of the colormap (e.g. green for terrain).
Below is a simple example and the corresponding output
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

y = np.linspace(-3, 3, 100)
x = np.linspace(-3, 3, 100)
z = np.zeros(shape=(x.shape[0], y.shape[0]))
for i in range(x.shape[0]):
    # creating some generic Z-axis data
    z[:, i] = norm.pdf(y, loc=0, scale=0.2+(i/100))
    z[:, i] = z[:, i] / np.sum(z[:, i])  # normalizing
z = np.where(z < 1e-6, 0, z)  # setting 'small enough' threshold
x_mat, y_mat = np.meshgrid(x, y)
f1 = plt.axes(projection='3d')
f1.plot_surface(x_mat, y_mat, z, cmap='terrain', edgecolor='none', rstride=1)
plt.show()
Here is the output from the above:
What I want the output to look like: all the 'light blue' regions would instead be green. Once below the defined threshold (1e-6 here), the color would jump to dark blue (so no regions would be light blue).
Alright, I figured out a solution to my own problem. I adapted the solution from HERE to address my issue. Below is the code to accomplish this.
Setup:
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt
from matplotlib.cm import get_cmap
y = np.linspace(-3, 3, 100)
x = np.linspace(-3, 3, 100)
z = np.zeros(shape=(x.shape[0], y.shape[0]))
x_mat, y_mat = np.meshgrid(x, y)
# Threshold to apply
z_threshold = 1e-6
for i in range(x.shape[0]):
    z[:, i] = norm.pdf(y, loc=0, scale=0.2+(i/100))
    z[:, i] = z[:, i] / np.sum(z[:, i])  # normalizing
Next I define two different colormaps. The first color map applies to all values above the threshold. If values are below the threshold, it sets that square as transparent.
cmap = get_cmap('terrain')
# 1.8 and 0.2 are used to restrict the upper and lower parts of the colormap
colors = cmap((z - z_threshold) / ((np.max(z)*1.8) - (np.min(z))) + 0.2)
# if below threshold, set as transparent (alpha=0)
colors[z < z_threshold, -1] = 0
The second colormap defines the color for all places below the threshold. This step isn't fully necessary, but it does prevent the plane from being drawn below the rest of the plot.
colors2 = cmap(z)
colors2[z >= z_threshold, -1] = 0
Now the colormaps can be used in two 3D plot calls:
# init 3D plot
f1 = plt.axes(projection='3d')
# Plot values above the threshold
f1.plot_surface(x_mat, y_mat, z, facecolors=colors, edgecolor='none', rstride=1)
# Plot values below the threshold
z_below = np.zeros(shape=(x.shape[0], y.shape[0]))
f1.plot_surface(x_mat, y_mat, z_below,
                facecolors=colors2, edgecolor='none', rstride=1, vmin=0)
# Setting the zlimits
f1.set_zlim([0, np.max(z)])
plt.show()
The above results in the following plot
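As an aside (this is not the original poster's method, just a sketch of an alternative): a similar colour 'jump' can be built directly into the normalization with matplotlib.colors.BoundaryNorm and a ListedColormap assembled from hand-picked 'terrain' colours, assuming a single plot_surface call with cmap/norm is acceptable:
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap, BoundaryNorm

# same generic data as above
y = np.linspace(-3, 3, 100)
x = np.linspace(-3, 3, 100)
z = np.zeros(shape=(x.shape[0], y.shape[0]))
for i in range(x.shape[0]):
    z[:, i] = norm.pdf(y, loc=0, scale=0.2+(i/100))
    z[:, i] = z[:, i] / np.sum(z[:, i])
x_mat, y_mat = np.meshgrid(x, y)
z_threshold = 1e-6

terrain = plt.get_cmap('terrain')
# one colour (dark blue) for the single bin below the threshold, then the
# upper part of 'terrain' (green and up) spread over 48 bins above it
colors = [terrain(0.0)] + [terrain(v) for v in np.linspace(0.45, 1.0, 48)]
cmap = ListedColormap(colors)
boundaries = np.concatenate(([0.0, z_threshold],
                             np.linspace(z_threshold, z.max(), 49)[1:]))
bnorm = BoundaryNorm(boundaries, ncolors=cmap.N)

f1 = plt.axes(projection='3d')
f1.plot_surface(x_mat, y_mat, z, cmap=cmap, norm=bnorm, edgecolor='none', rstride=1)
plt.show()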

Graphing a Failure Mode Plot

I am trying to create a code to produce a failure mode plot for honeycomb beam structures as seen in the image below:
(from M. Sadighi et al. 2009)
There are distinct formulas to calculate the load at which the failure will occur, and given the material/beam parameters, the lowest value is the most likely failure to occur.
I have a triple nested for loop running ranges of values for t, L, and rho. Meshgrid from numpy and the contour plot from matplotlib seemed logical, but they throw an error because the z input requires a 2D array.
I thought that maybe each failure type could be encoded into a value (i.e. 1 for core crush, 2 for indentation, etc.) and you could scan the x and y values to store where the changes in failure type happen, but I still don't know how to get that into a plot.
The closest thing I have found so far is The moons dataset and decision surface graphics in a Jupyter environment, where the plot is split into different colored regions.
How can this be plotted?
P.S. I know that you are supposed to attach code; however, in this case a ton of variables are needed to do the calculations, and they would be difficult to pass around.
I am trying to rephrase the question. Let's say we have functions of two coordinates: f1(x, y), f2(x, y), ... They correspond, for instance, to the critical stress for each failure mode; the variables x, y are two of the parameters.
For a given (x, y), the failure mode is obtained using argmin(f1(x, y), f2(x, y), ...), i.e. the failure mode for which the critical stress is minimal.
Here is the simple solution I could think of to obtain a map of the failure modes:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
# Mesh the parameters space
x = np.linspace(0, 2, 35)
y = np.linspace(0, 1, 24)
x_grid, y_grid = np.meshgrid(x, y)
# Compute the functions for each point on the mesh
f1 = x_grid + y_grid
f2 = 0.5 + x_grid**2
f3 = 1 + y_grid**2
# Identify which function is minimal for each point
failure_mode = np.argmin([f1, f2, f3], axis=0)
# Graph
discrete_colormap = ListedColormap(['gold', 'darkorange', 'mediumseagreen'])
plt.pcolormesh(x, y, failure_mode, cmap=discrete_colormap);
cbar = plt.colorbar();
cbar.set_label('failure mode');
cbar.set_ticks(np.arange(np.max(failure_mode)+1));
plt.xlabel('x'); plt.ylabel('y');
which gives:
See for example this answer for discrete colormap.
Here is a solution to plot the contour of each zone:
The contour of the zone i is defined as the points (x, y) such that f_i(x, y) is equal to the minimum of all the remaining functions i.e. min( f_j(x, y) for i != j ). We could use several contour plots with surfaces equal to f_i(x, y) - min( f_j(x, y) for i!=j ). The level of the zone boundary is zero.
import numpy as np
import matplotlib.pyplot as plt
# Mesh the parameters space
x = np.linspace(0, 2, 35)
y = np.linspace(0, 1, 24)
x_grid, y_grid = np.meshgrid(x, y)
# List of functions evaluated for each point of the mesh
f_grid = [x_grid + y_grid,
          0.5 + x_grid**2,
          1 + y_grid**2]
# Identify which function is minimal for each point
failure_mode = np.argmin(f_grid, axis=0)
# this part is for the background
critical_stress = np.min(f_grid, axis=0)
plt.pcolormesh(x, y, critical_stress, shading='Gouraud')
cbar = plt.colorbar();
cbar.set_label('critical stress');
# Plot the contour of each zone
for i in range(len(f_grid)):
    other_functions = [f_j for j, f_j in enumerate(f_grid) if i != j]
    level_surface = f_grid[i] - np.min(other_functions, axis=0)
    plt.contour(x, y, level_surface,
                levels=[0, ],
                linewidths=2, colors='black');
    # label
    barycentre_x = np.mean(x_grid[failure_mode == i])
    barycentre_y = np.mean(y_grid[failure_mode == i])
    plt.text(barycentre_x, barycentre_y, 'mode %i' % i)
plt.xlabel('x'); plt.ylabel('y');
the graph is:

contour deformation in python

I have a contour map and I want to deform all of the contour lines, such that the contour of level 0.5 is deformed around the blue point situated on its line until it passes through the blue point on the contour of level 1, and so on.
Original map:
Deformed map:
I think there are two steps: the first is to delete some parts of the map, and the second is to redraw the contour map.
I think I have to iterate through the contour map like this:
CS = plt.contour(X, Y, Z)
for level in CS.collections:
    for kp, path in list(enumerate(level.get_paths())):
But I have no idea how to use kp and path
Any tips for doing this would be appreciated!
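For reference on that loop: each path holds the vertex array of one contour polyline, and kp just numbers the polylines within a level. A quick inspection sketch (note that ContourSet.collections is deprecated in recent matplotlib, where CS.get_paths() is available instead):
import numpy as np
import matplotlib.pyplot as plt

x = y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.hypot(X, Y)
CS = plt.contour(X, Y, Z)
for level in CS.collections:
    for kp, path in enumerate(level.get_paths()):
        verts = path.vertices  # (N, 2) array of the polyline's x, y points
        print(kp, verts.shape)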
Here is an example of how you could change the contour plot to achieve the intended deformation.
It generates some data x,y,z which should later be modified. Then it specifies a deformation function which, when multiplied with z, deforms the data in the desired way. This deformation function takes the x and y data as input, as well as the angle of the line along which to perform the deformation and a width (spread) of the deformation. Finally, a parameter i is used for steering the degree of deformation (i.e. i=0 means no deformation). Of course you can use any other function to deform your contour.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.animation
#### generate some x,y,z data ####
r = np.linspace(0,6, num=100)
phi = np.linspace(0, 2*np.pi, num=200)
R, Phi = np.meshgrid(r,phi)
x = R*np.cos(Phi)
y = R*np.sin(Phi)
z = R
##################################
fig, ax=plt.subplots()
ax.set_aspect("equal")
def update(i):
    ax.clear()
    f = lambda x, y, offs, width, i: 1 - i*np.exp(-(np.arctan2(x, y)-offs)**2/width)
    z_deformed = z*f(x, y, np.pi/4, 1./8., i=i)
    ax.contour(x, y, z_deformed, 10, linewidths=4)
    ax.contourf(x, y, z_deformed, 10, alpha=0.3)
    ax.set_xlim([-4, 4])
    ax.set_ylim([-4, 4])
update(0) #plot the original data
anipath = 0.5*np.sin(np.linspace(0, np.pi, num=20))**2
ani = matplotlib.animation.FuncAnimation(fig, update, frames=anipath, interval = 100)
plt.show()
Of course you can use other shapes of deformation. E.g. to get a triangular shape use
f = lambda x, A, a, b: A*(1.-np.abs((x-b)/a))*(np.abs((x-b)) < a )
z_deformed = z - f(np.arctan2(x,y), i, 1./8., np.pi/4 )

Matplotlib heatmap with changing y-values

I'm trying to plot some data for a measurement taken from between two surfaces. The z-direction in the system is defined as normal to the surfaces. The problem is that along the x-axis of my plot I'm varying the separation distance between the two surfaces, which means that for every slice the min/max of the y-axis change. I've sort of circumvented this by presenting a normalized y-axis where z_min is the bottom surface and z_max is the top surface:
However, this representation somewhat distorts the data. Ideally I would like to show the actual distance to the wall on the y-axis and just leave the areas outside of the system bounds white. I (poorly) sketched what I'm envisioning here (the actual distribution on the heatmap should look different, of course):
I can pretty easily plot what I want as a 3D scatter plot like so:
But how do I get the data into a plot-able form for a heatmap?
I'm guessing I would have to blow up the MxN array and fill in missing values through interpolation, or simply mark them as NaN? But then I'm also not quite sure how to add a hard cutoff to my color scheme to make everything outside of the system white.
You can do this with pcolormesh, which takes the corners of quadrilaterals as its arguments:
X, Y = np.meshgrid(np.linspace(0, 10, 100), np.linspace(0, 2*np.pi, 150),)
h = np.sin(Y)
Y *= np.linspace(.5, 1, 100)
fig, ax = plt.subplots(1, 1)
ax.pcolormesh(X, Y, h)
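The hard white cutoff asked about above can be layered on top of this: mask the out-of-bounds cells and let masked values render blank/white. A sketch with made-up wall positions (z_top and z_bottom here are hypothetical stand-ins for the real surface data):
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
z = np.linspace(-10, 10, 200)
X, Z = np.meshgrid(x, z)
z_top = 5 + 0.4*x        # hypothetical upper wall position per x-slice
z_bottom = -z_top        # hypothetical lower wall position
data = np.cos(Z) * np.exp(-0.1*X)
data = np.ma.masked_where((Z > z_top) | (Z < z_bottom), data)  # hide outside cells

cmap = plt.get_cmap('viridis').copy()
cmap.set_bad('white')    # 'bad' (masked/NaN) values drawn white where supported
plt.pcolormesh(X, Z, data, cmap=cmap, shading='auto')
plt.colorbar()
plt.show()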
Below is an implementation with triangular mesh contouring, based on CT Zhu's example.
If your domain is not convex, you will need to provide your own triangles to the triangulation, as default Delaunay triangulation meshes the convex hull from your points.
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as mtri
y = np.array([np.linspace(-i, i, 51)
              for i in np.linspace(5, 10)[::-1]])
x = (np.zeros((50, 51)) +
     np.linspace(1, 6, 50)[..., np.newaxis])
z = (np.zeros((50, 51)) -
     np.linspace(-5, 5, 51)**2 + 10)  # make up some z data
x = x.flatten()
y = y.flatten()
z = z.flatten()
print "x shape: ", x.shape
triang = mtri.Triangulation(x, y)
plt.tricontourf(triang, z)
plt.colorbar()
plt.show()
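On the non-convex caveat above, here is a small sketch of supplying explicit triangles (the point set and triangle indices are made up by hand for an L-shaped domain):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as mtri

x = np.array([0, 1, 2, 0, 1, 2, 0, 1], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1, 2, 2], dtype=float)
# two triangles per unit square of the L-shape; the re-entrant corner is left out
triangles = [[0, 1, 4], [0, 4, 3], [1, 2, 5], [1, 5, 4], [3, 4, 7], [3, 7, 6]]
triang = mtri.Triangulation(x, y, triangles=triangles)
z = x + y
plt.tricontourf(triang, z)
plt.colorbar()
plt.show()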
I guess maybe 2D interpolation using griddata is what you want?
import numpy as np
import matplotlib.pyplot as plt
# matplotlib.mlab.griddata was removed in matplotlib 3.1;
# scipy.interpolate.griddata is the modern replacement
from scipy.interpolate import griddata

xi = np.linspace(1, 5, 100)
yi = np.linspace(-10.5, 10.5, 100)
y = np.array([np.linspace(-i, i, 51) for i in np.linspace(5, 10)[::-1]])  # make up some y vectors with different ranges
x = np.zeros((50, 51)) + np.linspace(1, 6, 50)[..., np.newaxis]
z = np.zeros((50, 51)) - np.linspace(-5, 5, 51)**2 + 10  # make up some z data
x = x.flatten()
y = y.flatten()
z = z.flatten()
Xi, Yi = np.meshgrid(xi, yi)
zi = griddata((x, y), z, (Xi, Yi))
plt.contourf(xi, yi, zi, levels=np.unique(-np.linspace(-5, 5, 51)**2 + 10))  # levels must be increasing

Generate a heatmap using a scatter data set

I have a set of X,Y data points (about 10k) that are easy to plot as a scatter plot but that I would like to represent as a heatmap.
I looked through the examples in Matplotlib and they all seem to already start with heatmap cell values to generate the image.
Is there a method that converts a bunch of x, y, all different, to a heatmap (where zones with higher frequency of x, y would be "warmer")?
If you don't want hexagons, you can use numpy's histogram2d function:
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.clf()
plt.imshow(heatmap.T, extent=extent, origin='lower')
plt.show()
This makes a 50x50 heatmap. If you want, say, 512x384, you can put bins=(512, 384) in the call to histogram2d.
Example:
In Matplotlib lexicon, I think you want a hexbin plot.
If you're not familiar with this type of plot, it's just a bivariate histogram in which the xy-plane is tessellated by a regular grid of hexagons.
So, as with a histogram, you just count the number of points falling in each hexagon: discretize the plotting region as a set of windows, assign each point to one of these windows, and finally map the windows onto a color array, and you've got a hexbin diagram.
Though less commonly used than e.g. circles or squares, it is intuitive that hexagons are a better choice for the geometry of the binning container:
hexagons have nearest-neighbor symmetry (square bins don't: e.g., the distance from a point on a square's border to a point inside that square is not everywhere equal), and
the hexagon is the highest-n polygon that gives a regular plane tessellation (i.e., you can safely re-model your kitchen floor with hexagonal-shaped tiles because you won't have any void space between the tiles when you are finished--not true for all other higher-n, n >= 7, polygons).
(Matplotlib uses the term hexbin plot; so do (AFAIK) all of the plotting libraries for R; still I don't know if this is the generally accepted term for plots of this type, though I suspect it's likely, given that hexbin is short for hexagonal binning, which describes the essential step in preparing the data for display.)
from matplotlib import pyplot as PLT
from matplotlib import cm as CM
import numpy as NP
# matplotlib.mlab.bivariate_normal was removed in matplotlib 3.1;
# scipy.stats.multivariate_normal reproduces the same surfaces
from scipy.stats import multivariate_normal

x = y = NP.linspace(-5, 5, 100)
X, Y = NP.meshgrid(x, y)
pos = NP.dstack((X, Y))
Z1 = multivariate_normal([0, 0], [[2**2, 0], [0, 2**2]]).pdf(pos)
Z2 = multivariate_normal([1, 1], [[4**2, 0], [0, 1**2]]).pdf(pos)
ZD = Z2 - Z1
x = X.ravel()
y = Y.ravel()
z = ZD.ravel()
gridsize = 30
PLT.subplot(111)
# if 'bins=None', then the color of each hexagon corresponds directly to its count
# 'C' is optional--it maps values to x-y coordinates; if 'C' is None (default) then
# the result is a pure 2D histogram
PLT.hexbin(x, y, C=z, gridsize=gridsize, cmap=CM.jet, bins=None)
PLT.axis([x.min(), x.max(), y.min(), y.max()])
cb = PLT.colorbar()
cb.set_label('mean value')
PLT.show()
Edit: For a better approximation of Alejandro's answer, see below.
I know this is an old question, but wanted to add something to Alejandro's answer: If you want a nice smoothed image without using py-sphviewer, you can instead use np.histogram2d and apply a gaussian filter (from scipy.ndimage) to the heatmap:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from scipy.ndimage import gaussian_filter  # scipy.ndimage.filters is deprecated

def myplot(x, y, s, bins=1000):
    heatmap, xedges, yedges = np.histogram2d(x, y, bins=bins)
    heatmap = gaussian_filter(heatmap, sigma=s)
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    return heatmap.T, extent
fig, axs = plt.subplots(2, 2)
# Generate some test data
x = np.random.randn(1000)
y = np.random.randn(1000)
sigmas = [0, 16, 32, 64]
for ax, s in zip(axs.flatten(), sigmas):
    if s == 0:
        ax.plot(x, y, 'k.', markersize=5)
        ax.set_title("Scatter plot")
    else:
        img, extent = myplot(x, y, s)
        ax.imshow(img, extent=extent, origin='lower', cmap=cm.jet)
        ax.set_title(r"Smoothing with $\sigma$ = %d" % s)
plt.show()
Produces:
The scatter plot and s=16 plotted on top of each other for Agape Gal'lo (click for a better view):
One difference I noticed between my gaussian filter approach and Alejandro's approach was that his method shows local structures much better than mine. Therefore I implemented a simple nearest-neighbour method at pixel level. This method calculates, for each pixel, the inverse sum of the distances of the n closest points in the data. It is pretty computationally expensive at high resolutions, and I think there's a quicker way, so let me know if you have any improvements.
Update: As I suspected, there's a much faster method using scipy.spatial.cKDTree. See Gabriel's answer for the implementation.
Anyway, here's my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
def data_coord2view_coord(p, vlen, pmin, pmax):
    dp = pmax - pmin
    dv = (p - pmin) / dp * vlen
    return dv
def nearest_neighbours(xs, ys, reso, n_neighbours):
    im = np.zeros([reso, reso])
    extent = [np.min(xs), np.max(xs), np.min(ys), np.max(ys)]
    xv = data_coord2view_coord(xs, reso, extent[0], extent[1])
    yv = data_coord2view_coord(ys, reso, extent[2], extent[3])
    for x in range(reso):
        for y in range(reso):
            xp = (xv - x)
            yp = (yv - y)
            d = np.sqrt(xp**2 + yp**2)
            im[y][x] = 1 / np.sum(d[np.argpartition(d.ravel(), n_neighbours)[:n_neighbours]])
    return im, extent
n = 1000
xs = np.random.randn(n)
ys = np.random.randn(n)
resolution = 250
fig, axes = plt.subplots(2, 2)
for ax, neighbours in zip(axes.flatten(), [0, 16, 32, 64]):
    if neighbours == 0:
        ax.plot(xs, ys, 'k.', markersize=2)
        ax.set_aspect('equal')
        ax.set_title("Scatter Plot")
    else:
        im, extent = nearest_neighbours(xs, ys, resolution, neighbours)
        ax.imshow(im, origin='lower', extent=extent, cmap=cm.jet)
        ax.set_title("Smoothing over %d neighbours" % neighbours)
        ax.set_xlim(extent[0], extent[1])
        ax.set_ylim(extent[2], extent[3])
plt.show()
Result:
Instead of using np.histogram2d, which in general produces quite ugly histograms, I would like to recycle py-sphviewer, a python package for rendering particle simulations using an adaptive smoothing kernel, which can be easily installed from pip (see the webpage documentation). Consider the following code, which is based on the example:
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
import sphviewer as sph
def myplot(x, y, nb=32, xsize=500, ysize=500):
    xmin = np.min(x)
    xmax = np.max(x)
    ymin = np.min(y)
    ymax = np.max(y)
    x0 = (xmin+xmax)/2.
    y0 = (ymin+ymax)/2.
    pos = np.zeros([len(x), 3])
    pos[:, 0] = x
    pos[:, 1] = y
    w = np.ones(len(x))
    P = sph.Particles(pos, w, nb=nb)
    S = sph.Scene(P)
    S.update_camera(r='infinity', x=x0, y=y0, z=0,
                    xsize=xsize, ysize=ysize)
    R = sph.Render(S)
    R.set_logscale()
    img = R.get_image()
    extent = R.get_extent()
    for i, j in zip(range(4), [x0, x0, y0, y0]):
        extent[i] += j
    print(extent)
    return img, extent
fig = plt.figure(1, figsize=(10,10))
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
# Generate some test data
x = np.random.randn(1000)
y = np.random.randn(1000)
#Plotting a regular scatter plot
ax1.plot(x,y,'k.', markersize=5)
ax1.set_xlim(-3,3)
ax1.set_ylim(-3,3)
heatmap_16, extent_16 = myplot(x,y, nb=16)
heatmap_32, extent_32 = myplot(x,y, nb=32)
heatmap_64, extent_64 = myplot(x,y, nb=64)
ax2.imshow(heatmap_16, extent=extent_16, origin='lower', aspect='auto')
ax2.set_title("Smoothing over 16 neighbors")
ax3.imshow(heatmap_32, extent=extent_32, origin='lower', aspect='auto')
ax3.set_title("Smoothing over 32 neighbors")
#Make the heatmap using a smoothing over 64 neighbors
ax4.imshow(heatmap_64, extent=extent_64, origin='lower', aspect='auto')
ax4.set_title("Smoothing over 64 neighbors")
plt.show()
which produces the following image:
As you can see, the images look pretty nice, and we are able to identify different substructures in them. These images are constructed by spreading a given weight for every point within a certain domain, defined by the smoothing length, which in turn is given by the distance to the nb-th closest neighbour (I've chosen 16, 32 and 64 for the examples). So, higher density regions are typically spread over smaller regions compared to lower density regions.
The function myplot is just a very simple function that I've written in order to give the x,y data to py-sphviewer to do the magic.
If you are using matplotlib 1.2.x or newer, you can simply use plt.hist2d:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.randn(100000)
y = np.random.randn(100000)
plt.hist2d(x,y,bins=100)
plt.show()
Seaborn now has the jointplot function which should work nicely here:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)
sns.jointplot(x=x, y=y, kind='hex')
plt.show()
Here's Jurgy's great nearest neighbour approach, but implemented using scipy.spatial.cKDTree. In my tests it's about 100x faster.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from scipy.spatial import cKDTree
def data_coord2view_coord(p, resolution, pmin, pmax):
    dp = pmax - pmin
    dv = (p - pmin) / dp * resolution
    return dv
n = 1000
xs = np.random.randn(n)
ys = np.random.randn(n)
resolution = 250
extent = [np.min(xs), np.max(xs), np.min(ys), np.max(ys)]
xv = data_coord2view_coord(xs, resolution, extent[0], extent[1])
yv = data_coord2view_coord(ys, resolution, extent[2], extent[3])
def kNN2DDens(xv, yv, resolution, neighbours, dim=2):
    """Density image from the inverse summed distance to the nearest points."""
    # Create the tree
    tree = cKDTree(np.array([xv, yv]).T)
    # Find the closest nnmax-1 neighbors (first entry is the point itself)
    grid = np.mgrid[0:resolution, 0:resolution].T.reshape(resolution**2, dim)
    dists = tree.query(grid, neighbours)
    # Inverse of the sum of distances to each grid point.
    inv_sum_dists = 1. / dists[0].sum(1)
    # Reshape
    im = inv_sum_dists.reshape(resolution, resolution)
    return im
fig, axes = plt.subplots(2, 2, figsize=(15, 15))
for ax, neighbours in zip(axes.flatten(), [0, 16, 32, 63]):
    if neighbours == 0:
        ax.plot(xs, ys, 'k.', markersize=5)
        ax.set_aspect('equal')
        ax.set_title("Scatter Plot")
    else:
        im = kNN2DDens(xv, yv, resolution, neighbours)
        ax.imshow(im, origin='lower', extent=extent, cmap=cm.Blues)
        ax.set_title("Smoothing over %d neighbours" % neighbours)
        ax.set_xlim(extent[0], extent[1])
        ax.set_ylim(extent[2], extent[3])
plt.savefig('new.png', dpi=150, bbox_inches='tight')
And the initial question was... how to convert scatter values to grid values, right?
histogram2d does count the frequency per cell; however, if you have other data per cell than just the frequency, you need to do some additional work (one version of which is sketched below).
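One common form of that extra work, for reference, is a per-cell mean built from two histogram2d calls (a sketch with synthetic data, separate from the griddata route this answer takes):
import numpy as np
import matplotlib.pyplot as plt

x, y = np.random.randn(2, 5000)
z = x**2 + y**2 + 0.1*np.random.randn(5000)   # some value attached to each point
sums, xe, ye = np.histogram2d(x, y, bins=50, weights=z)
counts, _, _ = np.histogram2d(x, y, bins=[xe, ye])
with np.errstate(invalid='ignore'):
    means = sums / counts                      # NaN where a cell is empty
plt.imshow(means.T, origin='lower', extent=[xe[0], xe[-1], ye[0], ye[-1]])
plt.colorbar()
plt.show()
Back to this answer's own data and the griddata approach: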
x = data_x  # between -10 and 4, log-gamma of an svc
y = data_y  # between -4 and 11, log-C of an svc
z = data_z  # between 0 and 0.78, f1-values from a difficult dataset
So, I have a dataset with Z-results for X and Y coordinates. However, I calculated only a few points outside the area of interest (large gaps), and heaps of points in a small area of interest.
Yes, here it becomes more difficult, but also more fun. Some libraries (sorry):
from matplotlib import pyplot as plt
from matplotlib import cm
import numpy as np
from scipy.interpolate import griddata
pyplot is my graphic engine today,
cm is a range of color maps with some interesting choices,
numpy is for the calculations,
and griddata is for attaching values to a fixed grid.
The last one is especially important because the frequency of xy points is not equally distributed in my data. First, let's start with some boundaries fitting my data and an arbitrary grid size. The original data has datapoints also outside those x and y boundaries.
#determine grid boundaries
gridsize = 500
x_min = -8
x_max = 2.5
y_min = -2
y_max = 7
So we have defined a grid with 500 pixels between the min and max values of x and y.
In my data, there are lots more than 500 values available in the area of high interest, whereas in the low-interest area there are not even 200 values in the total grid; between the graphic boundaries of x_min and x_max there are even fewer.
So for getting a nice picture, the task is to get an average for the high interest values and to fill the gaps elsewhere.
I define my grid now. For each xx-yy pair, I want to have a color.
xx = np.linspace(x_min, x_max, gridsize) # array of x values
yy = np.linspace(y_min, y_max, gridsize) # array of y values
grid = np.array(np.meshgrid(xx, yy.T))
grid = grid.reshape(2, grid.shape[1]*grid.shape[2]).T
Why the strange shape? scipy.griddata wants a shape of (n, D).
Griddata calculates one value per point in the grid, by a predefined method.
I choose "nearest" - empty grid points will be filled with values from the nearest neighbor. This looks as if the areas with less information have bigger cells (even if it is not the case). One could choose to interpolate "linear", then areas with less information look less sharp. Matter of taste, really.
points = np.array([x, y]).T # because griddata wants it that way
z_grid2 = griddata(points, z, grid, method='nearest')
# you get a 1D vector as result. Reshape to picture format!
z_grid2 = z_grid2.reshape(xx.shape[0], yy.shape[0])
And hop, we hand over to matplotlib to display the plot
fig = plt.figure(1, figsize=(10, 10))
ax1 = fig.add_subplot(111)
ax1.imshow(z_grid2, extent=[x_min, x_max, y_min, y_max],
           origin='lower', cmap=cm.magma)
ax1.set_title("SVC: empty spots filled by nearest neighbours")
ax1.set_xlabel('log gamma')
ax1.set_ylabel('log C')
plt.show()
Around the pointy part of the V-Shape, you see I did a lot of calculations during my search for the sweet spot, whereas the less interesting parts almost everywhere else have a lower resolution.
Make a 2-dimensional array that corresponds to the cells in your final image, called say heatmap_cells and instantiate it as all zeroes.
Choose two scaling factors that define the difference between each array element in real units, for each dimension, say x_scale and y_scale. Choose these such that all your datapoints will fall within the bounds of the heatmap array.
For each raw datapoint with x_value and y_value:
heatmap_cells[floor(x_value/x_scale),floor(y_value/y_scale)]+=1
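A runnable sketch of this recipe (with one adjustment: coordinates are shifted by their minimum first so that negative values also land inside the array; the scale values are arbitrary):
import numpy as np
import matplotlib.pyplot as plt

x = np.random.randn(10000)
y = np.random.randn(10000)
x_scale = y_scale = 0.1                      # real units per cell
ix = np.floor((x - x.min()) / x_scale).astype(int)
iy = np.floor((y - y.min()) / y_scale).astype(int)
heatmap_cells = np.zeros((ix.max() + 1, iy.max() + 1))
np.add.at(heatmap_cells, (ix, iy), 1)        # the '+=1' per raw datapoint
plt.imshow(heatmap_cells.T, origin='lower')
plt.show()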
Very similar to @Piti's answer, but using 1 call instead of 2 to generate the points:
import numpy as np
import matplotlib.pyplot as plt
pts = 1000000
mean = [0.0, 0.0]
cov = [[1.0,0.0],[0.0,1.0]]
x,y = np.random.multivariate_normal(mean, cov, pts).T
plt.hist2d(x, y, bins=50, cmap=plt.cm.jet)
plt.show()
Output:
Here's one I made on a 1 Million point set with 3 categories (colored Red, Green, and Blue). Here's a link to the repository if you'd like to try the function. Github Repo
histplot(
    X,
    Y,
    labels,
    bins=2000,
    range=((-3, 3), (-3, 3)),
    normalize_each_label=True,
    colors=[
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]],
    gain=50)
I'm afraid I'm a little late to the party, but I had a similar question a while ago. The accepted answer (by @ptomato) helped me out, but I'd also like to post this in case it's of use to someone.
''' I wanted to create a heatmap resembling a football pitch which would show the different actions performed '''
import numpy as np
import matplotlib.pyplot as plt

# fixing random state for reproducibility
np.random.seed(1234324)
fig = plt.figure(12)
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
# Ratio of the pitch with respect to UEFA standards
hmap = np.full((6, 10), 0)
# print(hmap)
xlist = np.random.uniform(low=0.0, high=100.0, size=(20))
ylist = np.random.uniform(low=0.0, high=100.0, size=(20))
# UEFA Pitch Standards are 105m x 68m
xlist = (xlist/100)*10.5
ylist = (ylist/100)*6.5
ax1.scatter(xlist, ylist)
# int of the co-ordinates to populate the array, clipped so points at the
# upper edge don't index past the last row/column
xlist_int = np.clip(xlist.astype(int), 0, hmap.shape[1] - 1)
ylist_int = np.clip(ylist.astype(int), 0, hmap.shape[0] - 1)
# print(xlist_int, ylist_int)
for i, j in zip(xlist_int, ylist_int):
    # this populates the array according to the x,y co-ordinate values it encounters
    hmap[j][i] = hmap[j][i] + 1
# Reversing the rows is necessary
hmap = hmap[::-1]
# print(hmap)
im = ax2.imshow(hmap)
Here's the result
None of these solutions worked for my application, so this is what I came up with. Essentially I am placing a 2D Gaussian at every single point:
import cv2
import numpy as np
import matplotlib.pyplot as plt

def getGaussian2D(ksize, sigma, norm=True):
    oneD = cv2.getGaussianKernel(ksize=ksize, sigma=sigma)
    twoD = np.outer(oneD.T, oneD)
    return twoD / np.sum(twoD) if norm else twoD

def pts2heat(pts, shape, kernel=16, sigma=5):
    heat = np.zeros(shape)
    k = getGaussian2D(kernel, sigma)
    for y, x in pts:
        x, y = int(x), int(y)
        for i in range(-kernel//2, kernel//2):
            for j in range(-kernel//2, kernel//2):
                if 0 <= x+i < shape[0] and 0 <= y+j < shape[1]:
                    heat[x+i, y+j] = heat[x+i, y+j] + k[i+kernel//2, j+kernel//2]
    return heat

# pts is your list of (y, x) points and img the associated image
heat = pts2heat(pts, img.shape[:2])
plt.imshow(heat, cmap='hot')  # note: 'heat' is not a matplotlib colormap, 'hot' is
Here are the points overlaid on top of their associated image, along with the resulting heat map:
