matplotlib, rectangular plot in polar axes - python

Is it possible to have a standard rectangular (aka Cartesian) plot, but displayed on a polar set of axes?
I just want the masking appearance of the grey border provided by the polar() function, but I do not want to convert my coordinates to polar, and use polar().

If you just want to be able to call two arrays in rectangular coordinates, while plotting on a polar grid, this will give you what you need. Here, x and y are your arrays that, in rectangular, would give you an x=y line. It returns the same trend line but on in polar coordinates. If you require more of a mask, that actually limits some of the data being presented, then insert a line in the for loop that says something like: for r < 5.0: or whatever your criteria is for the mask.
from math import cos, sin, atan2, sqrt
import matplotlib.pyplot as plt
plt.clf()
width, height = matplotlib.rcParams['figure.figsize']
size = min(width, height)
# make a square figure
fig = figure(figsize=(size, size))
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8], polar=True, axisbg='#d5de9c')
x = range(10)
y = range(10)
r = []
phi = []
for ii in range(len(x)):
r.append(sqrt(x[ii]**2.+y[ii]**2.))
phi.append(atan2(y[ii],x[ii]))
ax.scatter(phi,r)
show()

Related

Heatmap for given areas by coordinates

Let's say there is a 100*100 grid in the range of [0,1]. Each point in this grid with coordinates (x,y) has some value Z, where x and y can take any of the values {0.01, 0.02, 0.03, …, 1}. However, in the chart below, only points in certain areas matter (e.g. the rectangles in chart below).
I have a method F that when given a point with its coordinates tells me whether that point matters, I used a rectangle in the chart below just for demonstration - it’s not necessarily a rectangle. So I can calculate the centers of all the points for which I need the heatmap for - the rest should be white.
I was thinking maybe I can first build a 2D array of size 100*100 and populate it with the corresponding Z value for that coordinate, but then using the method F, set the Z for coordinates outside of those rectangles to some number (0, or -10000) that will result in white in the final heatmap, but couldn't quite make it happen.
Any help would be appreciated.
If all you are trying to do is blank out some of the image, just set it to NaN:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
data = np.random.randn(100, 100)
x = np.arange(100)
y = np.arange(100)
X, Y = np.meshgrid(x,y)
data[((X**2 + (Y-30)**2) > 25**2)
& ~((X>20) & (X<83) & (Y>60) & (Y<85))] = np.NaN
ax.imshow(data)
imshow has an extent argument that you can use to arbitrarily place images on the axes.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
ax.imshow(np.random.randn(4, 5), extent=(3, 8, 2, 6))
ax.imshow(np.random.randn(7, 3), extent=(1, 2.7, 6.3, 9.8))
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
plt.show()
You can do the same thing with pcolormesh.

How to plot only some cells with certain values of an array and others not with matplotlib.pyplot?

I have an image that consists of float values and another one that consist only of ones and zeros. I want to plot the second image over the first one, but I only want to plot the ones from the second image. The zeros shall not be plotted.
Ì have tried the following code and I also changed the alpha of y to 1. The problem is, that either the red windows of y are changed from x (alpha of y = 0.5), or one can not even see the plots of x (alpha of y=1).
import matplotlib.pyplot as plt
import numpy as np
x = np.random.random(size=(20,20))
y = np.random.randint(2, size=(20,20))
fig = plt.figure()
plt.imshow(x, cmap="Greys", alpha = 0.5)
plt.imshow(y, cmap="Reds", alpha = 0.5)
plt.show()
How can I only plot the ones of y?
UPDATE:
Thank you for your answers! But this is not want I am looking for. I will explain again:
The result should be something like: x as background and every position, where y is 1, should be colored pure red.
Following the approach in this answer linked by #ImportanceOfBeingEarnest, the exact solution in your case would look like below. Here, np.ma.masked_where will mask your y array at places where it is 0. The resulting array will only contain 1.
EDIT: The problem of overlaying seems to stem from the choice of cmap. If you don't specify the cmap for the y, you can clearly see below that indeed only 1's are plotted and overlaid on the top of x. In order to have a discrete color (red in your case), you can create a custom color map
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import colors
x = np.random.random(size=(20,20))
y = np.random.randint(2, size=(20,20))
y_new =np.ma.masked_where(y==0, y)
cmap = colors.ListedColormap(['red'])
fig = plt.figure()
plt.imshow(x, cmap="Greys", alpha = 0.5)
plt.imshow(y_new, cmap=cmap, alpha=1)
plt.show()
We can inverse "The result should be [..] x as background and every position, where y is 1, should be colored pure red.", namely to just plot x, masked by y and set the background to red.
import matplotlib.pyplot as plt
import numpy as np
y = np.random.randint(2, size=(20,20))
x = np.random.random(size=(20,20))
X = np.ma.array(x, mask=y)
fig = plt.figure()
plt.imshow(X, cmap="Greys")
plt.gca().set_facecolor("red")
plt.show()
There are of course related Q&As like
Matplotlib imshow: how to apply a mask on the matrix or
How can I plot NaN values as a special color with imshow in matplotlib?
and there is also an example on the matplotlib page: Image masked

Delaunay Triangulation of points from 2D surface in 3D with python?

I have a collection of 3D points. These points are sampled at constant levels (z=0,1,...,7). An image should make it clear:
These points are in a numpy ndarray of shape (N, 3) called X. The above plot is created using:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
X = load('points.npy')
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_wireframe(X[:,0], X[:,1], X[:,2])
ax.scatter(X[:,0], X[:,1], X[:,2])
plt.draw()
I'd like to instead triangulate only the surface of this object, and plot the surface. I do not want the convex hull of this object, however, because this loses subtle shape information I'd like to be able to inspect.
I have tried ax.plot_trisurf(X[:,0], X[:,1], X[:,2]), but this results in the following mess:
Any help?
Example data
Here's a snippet to generate 3D data that is representative of the problem:
import numpy as np
X = []
for i in range(8):
t = np.linspace(0,2*np.pi,np.random.randint(30,50))
for j in range(t.shape[0]):
# random circular objects...
X.append([
(-0.05*(i-3.5)**2+1)*np.cos(t[j])+0.1*np.random.rand()-0.05,
(-0.05*(i-3.5)**2+1)*np.sin(t[j])+0.1*np.random.rand()-0.05,
i
])
X = np.array(X)
Example data from original image
Here's a pastebin to the original data:
http://pastebin.com/YBZhJcsV
Here are the slices along constant z:
update 3
Here's a concrete example of what I describe in update 2. If you don't have mayavi for visualization, I suggest installing it via edm using edm install mayavi pyqt matplotlib.
Toy 2D contours stacked in 3D
Contours -> 3D surface
Code to generate the figures
from matplotlib import path as mpath
from mayavi import mlab
import numpy as np
def make_star(amplitude=1.0, rotation=0.0):
""" Make a star shape
"""
t = np.linspace(0, 2*np.pi, 6) + rotation
star = np.zeros((12, 2))
star[::2] = np.c_[np.cos(t), np.sin(t)]
star[1::2] = 0.5*np.c_[np.cos(t + np.pi / 5), np.sin(t + np.pi / 5)]
return amplitude * star
def make_stars(n_stars=51, z_diff=0.05):
""" Make `2*n_stars-1` stars stacked in 3D
"""
amps = np.linspace(0.25, 1, n_stars)
amps = np.r_[amps, amps[:-1][::-1]]
rots = np.linspace(0, 2*np.pi, len(amps))
zamps = np.linspace
stars = []
for i, (amp, rot) in enumerate(zip(amps, rots)):
star = make_star(amplitude=amp, rotation=rot)
height = i*z_diff
z = np.full(len(star), height)
star3d = np.c_[star, z]
stars.append(star3d)
return stars
def polygon_to_boolean(points, xvals, yvals):
""" Convert `points` to a boolean indicator mask
over the specified domain
"""
x, y = np.meshgrid(xvals, yvals)
xy = np.c_[x.flatten(), y.flatten()]
mask = mpath.Path(points).contains_points(xy).reshape(x.shape)
return x, y, mask
def plot_contours(stars):
""" Plot a list of stars in 3D
"""
n = len(stars)
for i, star in enumerate(stars):
x, y, z = star.T
mlab.plot3d(*star.T)
#ax.plot3D(x, y, z, '-o', c=(0, 1-i/n, i/n))
#ax.set_xlim(-1, 1)
#ax.set_ylim(-1, 1)
mlab.show()
if __name__ == '__main__':
# Make and plot the 2D contours
stars3d = make_stars()
plot_contours(stars3d)
xvals = np.linspace(-1, 1, 101)
yvals = np.linspace(-1, 1, 101)
volume = np.dstack([
polygon_to_boolean(star[:,:2], xvals, yvals)[-1]
for star in stars3d
]).astype(float)
mlab.contour3d(volume, contours=[0.5])
mlab.show()
update 2
I now do this as follows:
I use the fact that the paths in each z-slice are closed and simple and use matplotlib.path to determine points inside and outside of the contour. Using this idea, I convert the contours in each slice to a boolean-valued image, which is combined into a boolean-valued volume.
Next, I use skimage's marching_cubes method to obtain a triangulation of the surface for visualization.
Here's an example of the method. I think the data is slightly different, but you can definitely see that the results are much cleaner, and can handle surfaces that are disconnected or have holes.
Original answer
Ok, here's the solution I came up with. It depends heavily on my data being roughly spherical and sampled at uniformly in z I think. Some of the other comments provide more information about more robust solutions. Since my data is roughly spherical I triangulate the azimuth and zenith angles from the spherical coordinate transform of my data points.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.tri as mtri
X = np.load('./mydatars.npy')
# My data points are strictly positive. This doesn't work if I don't center about the origin.
X -= X.mean(axis=0)
rad = np.linalg.norm(X, axis=1)
zen = np.arccos(X[:,-1] / rad)
azi = np.arctan2(X[:,1], X[:,0])
tris = mtri.Triangulation(zen, azi)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_trisurf(X[:,0], X[:,1], X[:,2], triangles=tris.triangles, cmap=plt.cm.bone)
plt.show()
Using the sample data from the pastebin above, this yields:
I realise that you mentioned in your question that you didn't want to use the convex hull because you might lose some shape information. I have a simple solution that works pretty well for your 'jittered spherical' example data, although it does use scipy.spatial.ConvexHull. I thought I would share it here anyway, just in case it's useful for others:
from matplotlib.tri import triangulation
from scipy.spatial import ConvexHull
# compute the convex hull of the points
cvx = ConvexHull(X)
x, y, z = X.T
# cvx.simplices contains an (nfacets, 3) array specifying the indices of
# the vertices for each simplical facet
tri = Triangulation(x, y, triangles=cvx.simplices)
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.hold(True)
ax.plot_trisurf(tri, z)
ax.plot_wireframe(x, y, z, color='r')
ax.scatter(x, y, z, color='r')
plt.draw()
It does pretty well in this case, since your example data ends up lying on a more-or-less convex surface. Perhaps you could make some more challenging example data? A toroidal surface would be a good test case which the convex hull method would obviously fail.
Mapping an arbitrary 3D surface from a point cloud is a really tough problem. Here's a related question containing some links that might be helpful.

Set a colormap under a graph

I know this is well documented, but I'm struggling to implement this in my code.
I would like to shade the area under my graph with a colormap. Is it possible to have a colour, i.e. red from any points over 30, and a gradient up until that point?
I am using the method fill_between, but I'm happy to change this if there is a better way to do it.
def plot(sd_values):
plt.figure()
sd_values=np.array(sd_values)
x=np.arange(len(sd_values))
plt.plot(x,sd_values, linewidth=1)
plt.fill_between(x,sd_values, cmap=plt.cm.jet)
plt.show()
This is the result at the moment. I have tried axvspan, but this doesnt have cmap as an option. Why does the below graph not show a colormap?
I'm not sure if the cmap argument should be part of the fill_between plotting command. In your case probably want to use the fill() command btw.
These fill commands create polygons or polygon collections. A polygon collection can take a cmap but with fill there is no way of providing the data on which it should be colored.
What's (for as far as i know) certainly not possible is to fill a single polygon with a gradient as you wish.
The next best thing is to fake it. You can plot a shaded image and clip it based on the created polygon.
# create some sample data
x = np.linspace(0, 1)
y = np.sin(4 * np.pi * x) * np.exp(-5 * x) * 120
fig, ax = plt.subplots()
# plot only the outline of the polygon, and capture the result
poly, = ax.fill(x, y, facecolor='none')
# get the extent of the axes
xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()
# create a dummy image
img_data = np.arange(ymin,ymax,(ymax-ymin)/100.)
img_data = img_data.reshape(img_data.size,1)
# plot and clip the image
im = ax.imshow(img_data, aspect='auto', origin='lower', cmap=plt.cm.Reds_r, extent=[xmin,xmax,ymin,ymax], vmin=y.min(), vmax=30.)
im.set_clip_path(poly)
The image is given an extent which basically stretches it over the entire axes. Then the clip_path makes it only showup where the fill polygon is drawn.
I think all you need is to do the plot of the data one at a time, like:
import numpy
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
# Create fake data
x = numpy.linspace(0,4)
y = numpy.exp(x)
# Now plot one by one
bar_width = x[1] - x[0] # assuming x is linealy spaced
for pointx, pointy in zip(x,y):
current_color = cm.jet( min(pointy/30, 30)) # maximum of 30
plt.bar(pointx, pointy, bar_width, color = current_color)
plt.show()
Resulting in:

Generate a heatmap using a scatter data set

I have a set of X,Y data points (about 10k) that are easy to plot as a scatter plot but that I would like to represent as a heatmap.
I looked through the examples in Matplotlib and they all seem to already start with heatmap cell values to generate the image.
Is there a method that converts a bunch of x, y, all different, to a heatmap (where zones with higher frequency of x, y would be "warmer")?
If you don't want hexagons, you can use numpy's histogram2d function:
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.clf()
plt.imshow(heatmap.T, extent=extent, origin='lower')
plt.show()
This makes a 50x50 heatmap. If you want, say, 512x384, you can put bins=(512, 384) in the call to histogram2d.
Example:
In Matplotlib lexicon, i think you want a hexbin plot.
If you're not familiar with this type of plot, it's just a bivariate histogram in which the xy-plane is tessellated by a regular grid of hexagons.
So from a histogram, you can just count the number of points falling in each hexagon, discretiize the plotting region as a set of windows, assign each point to one of these windows; finally, map the windows onto a color array, and you've got a hexbin diagram.
Though less commonly used than e.g., circles, or squares, that hexagons are a better choice for the geometry of the binning container is intuitive:
hexagons have nearest-neighbor symmetry (e.g., square bins don't,
e.g., the distance from a point on a square's border to a point
inside that square is not everywhere equal) and
hexagon is the highest n-polygon that gives regular plane
tessellation (i.e., you can safely re-model your kitchen floor with hexagonal-shaped tiles because you won't have any void space between the tiles when you are finished--not true for all other higher-n, n >= 7, polygons).
(Matplotlib uses the term hexbin plot; so do (AFAIK) all of the plotting libraries for R; still i don't know if this is the generally accepted term for plots of this type, though i suspect it's likely given that hexbin is short for hexagonal binning, which is describes the essential step in preparing the data for display.)
from matplotlib import pyplot as PLT
from matplotlib import cm as CM
from matplotlib import mlab as ML
import numpy as NP
n = 1e5
x = y = NP.linspace(-5, 5, 100)
X, Y = NP.meshgrid(x, y)
Z1 = ML.bivariate_normal(X, Y, 2, 2, 0, 0)
Z2 = ML.bivariate_normal(X, Y, 4, 1, 1, 1)
ZD = Z2 - Z1
x = X.ravel()
y = Y.ravel()
z = ZD.ravel()
gridsize=30
PLT.subplot(111)
# if 'bins=None', then color of each hexagon corresponds directly to its count
# 'C' is optional--it maps values to x-y coordinates; if 'C' is None (default) then
# the result is a pure 2D histogram
PLT.hexbin(x, y, C=z, gridsize=gridsize, cmap=CM.jet, bins=None)
PLT.axis([x.min(), x.max(), y.min(), y.max()])
cb = PLT.colorbar()
cb.set_label('mean value')
PLT.show()
Edit: For a better approximation of Alejandro's answer, see below.
I know this is an old question, but wanted to add something to Alejandro's anwser: If you want a nice smoothed image without using py-sphviewer you can instead use np.histogram2d and apply a gaussian filter (from scipy.ndimage.filters) to the heatmap:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from scipy.ndimage.filters import gaussian_filter
def myplot(x, y, s, bins=1000):
heatmap, xedges, yedges = np.histogram2d(x, y, bins=bins)
heatmap = gaussian_filter(heatmap, sigma=s)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
return heatmap.T, extent
fig, axs = plt.subplots(2, 2)
# Generate some test data
x = np.random.randn(1000)
y = np.random.randn(1000)
sigmas = [0, 16, 32, 64]
for ax, s in zip(axs.flatten(), sigmas):
if s == 0:
ax.plot(x, y, 'k.', markersize=5)
ax.set_title("Scatter plot")
else:
img, extent = myplot(x, y, s)
ax.imshow(img, extent=extent, origin='lower', cmap=cm.jet)
ax.set_title("Smoothing with $\sigma$ = %d" % s)
plt.show()
Produces:
The scatter plot and s=16 plotted on top of eachother for Agape Gal'lo (click for better view):
One difference I noticed with my gaussian filter approach and Alejandro's approach was that his method shows local structures much better than mine. Therefore I implemented a simple nearest neighbour method at pixel level. This method calculates for each pixel the inverse sum of the distances of the n closest points in the data. This method is at a high resolution pretty computationally expensive and I think there's a quicker way, so let me know if you have any improvements.
Update: As I suspected, there's a much faster method using Scipy's scipy.cKDTree. See Gabriel's answer for the implementation.
Anyway, here's my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
def data_coord2view_coord(p, vlen, pmin, pmax):
dp = pmax - pmin
dv = (p - pmin) / dp * vlen
return dv
def nearest_neighbours(xs, ys, reso, n_neighbours):
im = np.zeros([reso, reso])
extent = [np.min(xs), np.max(xs), np.min(ys), np.max(ys)]
xv = data_coord2view_coord(xs, reso, extent[0], extent[1])
yv = data_coord2view_coord(ys, reso, extent[2], extent[3])
for x in range(reso):
for y in range(reso):
xp = (xv - x)
yp = (yv - y)
d = np.sqrt(xp**2 + yp**2)
im[y][x] = 1 / np.sum(d[np.argpartition(d.ravel(), n_neighbours)[:n_neighbours]])
return im, extent
n = 1000
xs = np.random.randn(n)
ys = np.random.randn(n)
resolution = 250
fig, axes = plt.subplots(2, 2)
for ax, neighbours in zip(axes.flatten(), [0, 16, 32, 64]):
if neighbours == 0:
ax.plot(xs, ys, 'k.', markersize=2)
ax.set_aspect('equal')
ax.set_title("Scatter Plot")
else:
im, extent = nearest_neighbours(xs, ys, resolution, neighbours)
ax.imshow(im, origin='lower', extent=extent, cmap=cm.jet)
ax.set_title("Smoothing over %d neighbours" % neighbours)
ax.set_xlim(extent[0], extent[1])
ax.set_ylim(extent[2], extent[3])
plt.show()
Result:
Instead of using np.hist2d, which in general produces quite ugly histograms, I would like to recycle py-sphviewer, a python package for rendering particle simulations using an adaptive smoothing kernel and that can be easily installed from pip (see webpage documentation). Consider the following code, which is based on the example:
import numpy as np
import numpy.random
import matplotlib.pyplot as plt
import sphviewer as sph
def myplot(x, y, nb=32, xsize=500, ysize=500):
xmin = np.min(x)
xmax = np.max(x)
ymin = np.min(y)
ymax = np.max(y)
x0 = (xmin+xmax)/2.
y0 = (ymin+ymax)/2.
pos = np.zeros([len(x),3])
pos[:,0] = x
pos[:,1] = y
w = np.ones(len(x))
P = sph.Particles(pos, w, nb=nb)
S = sph.Scene(P)
S.update_camera(r='infinity', x=x0, y=y0, z=0,
xsize=xsize, ysize=ysize)
R = sph.Render(S)
R.set_logscale()
img = R.get_image()
extent = R.get_extent()
for i, j in zip(xrange(4), [x0,x0,y0,y0]):
extent[i] += j
print extent
return img, extent
fig = plt.figure(1, figsize=(10,10))
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
# Generate some test data
x = np.random.randn(1000)
y = np.random.randn(1000)
#Plotting a regular scatter plot
ax1.plot(x,y,'k.', markersize=5)
ax1.set_xlim(-3,3)
ax1.set_ylim(-3,3)
heatmap_16, extent_16 = myplot(x,y, nb=16)
heatmap_32, extent_32 = myplot(x,y, nb=32)
heatmap_64, extent_64 = myplot(x,y, nb=64)
ax2.imshow(heatmap_16, extent=extent_16, origin='lower', aspect='auto')
ax2.set_title("Smoothing over 16 neighbors")
ax3.imshow(heatmap_32, extent=extent_32, origin='lower', aspect='auto')
ax3.set_title("Smoothing over 32 neighbors")
#Make the heatmap using a smoothing over 64 neighbors
ax4.imshow(heatmap_64, extent=extent_64, origin='lower', aspect='auto')
ax4.set_title("Smoothing over 64 neighbors")
plt.show()
which produces the following image:
As you see, the images look pretty nice, and we are able to identify different substructures on it. These images are constructed spreading a given weight for every point within a certain domain, defined by the smoothing length, which in turns is given by the distance to the closer nb neighbor (I've chosen 16, 32 and 64 for the examples). So, higher density regions typically are spread over smaller regions compared to lower density regions.
The function myplot is just a very simple function that I've written in order to give the x,y data to py-sphviewer to do the magic.
If you are using 1.2.x
import numpy as np
import matplotlib.pyplot as plt
x = np.random.randn(100000)
y = np.random.randn(100000)
plt.hist2d(x,y,bins=100)
plt.show()
Seaborn now has the jointplot function which should work nicely here:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
# Generate some test data
x = np.random.randn(8873)
y = np.random.randn(8873)
sns.jointplot(x=x, y=y, kind='hex')
plt.show()
Here's Jurgy's great nearest neighbour approach but implemented using scipy.cKDTree. In my tests it's about 100x faster.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from scipy.spatial import cKDTree
def data_coord2view_coord(p, resolution, pmin, pmax):
dp = pmax - pmin
dv = (p - pmin) / dp * resolution
return dv
n = 1000
xs = np.random.randn(n)
ys = np.random.randn(n)
resolution = 250
extent = [np.min(xs), np.max(xs), np.min(ys), np.max(ys)]
xv = data_coord2view_coord(xs, resolution, extent[0], extent[1])
yv = data_coord2view_coord(ys, resolution, extent[2], extent[3])
def kNN2DDens(xv, yv, resolution, neighbours, dim=2):
"""
"""
# Create the tree
tree = cKDTree(np.array([xv, yv]).T)
# Find the closest nnmax-1 neighbors (first entry is the point itself)
grid = np.mgrid[0:resolution, 0:resolution].T.reshape(resolution**2, dim)
dists = tree.query(grid, neighbours)
# Inverse of the sum of distances to each grid point.
inv_sum_dists = 1. / dists[0].sum(1)
# Reshape
im = inv_sum_dists.reshape(resolution, resolution)
return im
fig, axes = plt.subplots(2, 2, figsize=(15, 15))
for ax, neighbours in zip(axes.flatten(), [0, 16, 32, 63]):
if neighbours == 0:
ax.plot(xs, ys, 'k.', markersize=5)
ax.set_aspect('equal')
ax.set_title("Scatter Plot")
else:
im = kNN2DDens(xv, yv, resolution, neighbours)
ax.imshow(im, origin='lower', extent=extent, cmap=cm.Blues)
ax.set_title("Smoothing over %d neighbours" % neighbours)
ax.set_xlim(extent[0], extent[1])
ax.set_ylim(extent[2], extent[3])
plt.savefig('new.png', dpi=150, bbox_inches='tight')
and the initial question was... how to convert scatter values to grid values, right?
histogram2d does count the frequency per cell, however, if you have other data per cell than just the frequency, you'd need some additional work to do.
x = data_x # between -10 and 4, log-gamma of an svc
y = data_y # between -4 and 11, log-C of an svc
z = data_z #between 0 and 0.78, f1-values from a difficult dataset
So, I have a dataset with Z-results for X and Y coordinates. However, I was calculating few points outside the area of interest (large gaps), and heaps of points in a small area of interest.
Yes here it becomes more difficult but also more fun. Some libraries (sorry):
from matplotlib import pyplot as plt
from matplotlib import cm
import numpy as np
from scipy.interpolate import griddata
pyplot is my graphic engine today,
cm is a range of color maps with some initeresting choice.
numpy for the calculations,
and griddata for attaching values to a fixed grid.
The last one is important especially because the frequency of xy points is not equally distributed in my data. First, let's start with some boundaries fitting to my data and an arbitrary grid size. The original data has datapoints also outside those x and y boundaries.
#determine grid boundaries
gridsize = 500
x_min = -8
x_max = 2.5
y_min = -2
y_max = 7
So we have defined a grid with 500 pixels between the min and max values of x and y.
In my data, there are lots more than the 500 values available in the area of high interest; whereas in the low-interest-area, there are not even 200 values in the total grid; between the graphic boundaries of x_min and x_max there are even less.
So for getting a nice picture, the task is to get an average for the high interest values and to fill the gaps elsewhere.
I define my grid now. For each xx-yy pair, i want to have a color.
xx = np.linspace(x_min, x_max, gridsize) # array of x values
yy = np.linspace(y_min, y_max, gridsize) # array of y values
grid = np.array(np.meshgrid(xx, yy.T))
grid = grid.reshape(2, grid.shape[1]*grid.shape[2]).T
Why the strange shape? scipy.griddata wants a shape of (n, D).
Griddata calculates one value per point in the grid, by a predefined method.
I choose "nearest" - empty grid points will be filled with values from the nearest neighbor. This looks as if the areas with less information have bigger cells (even if it is not the case). One could choose to interpolate "linear", then areas with less information look less sharp. Matter of taste, really.
points = np.array([x, y]).T # because griddata wants it that way
z_grid2 = griddata(points, z, grid, method='nearest')
# you get a 1D vector as result. Reshape to picture format!
z_grid2 = z_grid2.reshape(xx.shape[0], yy.shape[0])
And hop, we hand over to matplotlib to display the plot
fig = plt.figure(1, figsize=(10, 10))
ax1 = fig.add_subplot(111)
ax1.imshow(z_grid2, extent=[x_min, x_max,y_min, y_max, ],
origin='lower', cmap=cm.magma)
ax1.set_title("SVC: empty spots filled by nearest neighbours")
ax1.set_xlabel('log gamma')
ax1.set_ylabel('log C')
plt.show()
Around the pointy part of the V-Shape, you see I did a lot of calculations during my search for the sweet spot, whereas the less interesting parts almost everywhere else have a lower resolution.
Make a 2-dimensional array that corresponds to the cells in your final image, called say heatmap_cells and instantiate it as all zeroes.
Choose two scaling factors that define the difference between each array element in real units, for each dimension, say x_scale and y_scale. Choose these such that all your datapoints will fall within the bounds of the heatmap array.
For each raw datapoint with x_value and y_value:
heatmap_cells[floor(x_value/x_scale),floor(y_value/y_scale)]+=1
Very similar to #Piti's answer, but using 1 call instead of 2 to generate the points:
import numpy as np
import matplotlib.pyplot as plt
pts = 1000000
mean = [0.0, 0.0]
cov = [[1.0,0.0],[0.0,1.0]]
x,y = np.random.multivariate_normal(mean, cov, pts).T
plt.hist2d(x, y, bins=50, cmap=plt.cm.jet)
plt.show()
Output:
Here's one I made on a 1 Million point set with 3 categories (colored Red, Green, and Blue). Here's a link to the repository if you'd like to try the function. Github Repo
histplot(
X,
Y,
labels,
bins=2000,
range=((-3,3),(-3,3)),
normalize_each_label=True,
colors = [
[1,0,0],
[0,1,0],
[0,0,1]],
gain=50)
I'm afraid I'm a little late to the party but I had a similar question a while ago. The accepted answer (by #ptomato) helped me out but I'd also want to post this in case it's of use to someone.
''' I wanted to create a heatmap resembling a football pitch which would show the different actions performed '''
import numpy as np
import matplotlib.pyplot as plt
import random
#fixing random state for reproducibility
np.random.seed(1234324)
fig = plt.figure(12)
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
#Ratio of the pitch with respect to UEFA standards
hmap= np.full((6, 10), 0)
#print(hmap)
xlist = np.random.uniform(low=0.0, high=100.0, size=(20))
ylist = np.random.uniform(low=0.0, high =100.0, size =(20))
#UEFA Pitch Standards are 105m x 68m
xlist = (xlist/100)*10.5
ylist = (ylist/100)*6.5
ax1.scatter(xlist,ylist)
#int of the co-ordinates to populate the array
xlist_int = xlist.astype (int)
ylist_int = ylist.astype (int)
#print(xlist_int, ylist_int)
for i, j in zip(xlist_int, ylist_int):
#this populates the array according to the x,y co-ordinate values it encounters
hmap[j][i]= hmap[j][i] + 1
#Reversing the rows is necessary
hmap = hmap[::-1]
#print(hmap)
im = ax2.imshow(hmap)
Here's the result
None of these solutions worked for my application, so this is what I came up with. Essentially I am placing a 2D Gaussian at every single point:
import cv2
import numpy as np
import matplotlib.pyplot as plt
def getGaussian2D(ksize, sigma, norm=True):
oneD = cv2.getGaussianKernel(ksize=ksize, sigma=sigma)
twoD = np.outer(oneD.T, oneD)
return twoD / np.sum(twoD) if norm else twoD
def pt2heat(pts, shape, kernel=16, sigma=5):
heat = np.zeros(shape)
k = getGaussian2D(kernel, sigma)
for y,x in pts:
x, y = int(x), int(y)
for i in range(-kernel//2, kernel//2):
for j in range(-kernel//2, kernel//2):
if 0 <= x+i < shape[0] and 0 <= y+j < shape[1]:
heat[x+i, y+j] = heat[x+i, y+j] + k[i+kernel//2, j+kernel//2]
return heat
heat = pts2heat(pts, img.shape[:2])
plt.imshow(heat, cmap='heat')
Here are the points overlayed ontop of it's associated image, along with the resulting heat map:

Categories

Resources