Creating Density/Heatmap Plot from Coordinates and Magnitude in Python

I have some data giving the number of readings at each point on a 5x10 grid, in the following format:
X = [1, 2, 3, 4,..., 5]
Y = [1, 1, 1, 1,...,10]
Z = [9,8,14,0,89,...,0]
I would like to plot this as a heatmap/density map viewed from above, but all of the matplotlib plots (incl. contourf) that I have found require a 2D array for Z, and I don't understand why.
EDIT:
I have now collected the actual coordinates that I want to plot, which are not as regular as the ones above. They are:
X = [8,7,7,7,8,8,8,9,9.5,9.5,9.5,11,11,11,10.5,
10.5,10.5,10.5,9,9,8, 8,8,8,6.5,6.5,1,2.5,4.5,
4.5,2,2,2,3,3,3,4,4.5,4.5,4.5,4.5,3.5,2.5,2.5,
1,1,1,2,2,2]
Y = [5.5,7.5,8,9,9,8,7.5,6,6.5,8,9,9,8,6.5,5.5,
5,3.5,2,2,1,2,3.5,5,1,1,2,4.5,4.5,4.5,4,3,
2,1,1,2,3,4.5,3.5,2.5,1.5,1,5.5,5.5,6,7,8,9,
9,8,7]
z = [286,257,75,38,785,3074,1878,1212,2501,1518,419,33,
3343,1808,3233,5943,10511,3593,1086,139,565,61,61,
189,155,105,120,225,682,416,30632,2035,165,6777,
7223,465,2510,7128,2296,1659,1358,204,295,854,7838,
122,5206,6516,221,282]
From what I understand you can't use floats in a np.array so I have tried to multiply all values by 10 so that they are all integers, but I am still running into some issues. Am I trying to do something that will not work?

They expect a 2D array because the row and column indices set the position of each value. For example, if array[2, 3] = 5, the heatmap draws the value 5 in the cell at row 2, column 3. In the code below the first index is x and the second is y, so imshow draws x down the vertical axis; plot array.T if you want x along the horizontal axis.
So, let's try transforming your current data into a single 2D array:
>>> array = np.zeros((len(set(X)), len(set(Y))))
>>> for x, y, z in zip(X, Y, Z):
...     array[x-1, y-1] = z
If X, Y and Z are integer np.arrays, you could do this too:
>>> array = np.zeros((len(set(X)), len(set(Y))))
>>> array[X - 1, Y - 1] = Z
And now just plot the array as you prefer:
>>> plt.imshow(array, cmap="hot", interpolation="nearest")
>>> plt.show()
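For the irregular float coordinates in the edit, subtracting 1 no longer gives valid indices. One possible workaround (a sketch, not part of the answer above) is to let np.unique map each distinct coordinate to an integer position. Only the first six points from the question are used here, and cells with no reading are left as NaN, which is an assumption about how gaps should be shown.

```python
import numpy as np

# First six points of the irregular data from the question's edit.
X = np.array([8, 7, 7, 7, 8, 8])
Y = np.array([5.5, 7.5, 8, 9, 9, 8])
Z = np.array([286, 257, 75, 38, 785, 3074])

# return_inverse gives, for every point, the index of its coordinate
# within the sorted unique values, so floats never act as indices.
xs, xi = np.unique(X, return_inverse=True)
ys, yi = np.unique(Y, return_inverse=True)

# Rows follow y and columns follow x; cells without a reading stay NaN.
grid = np.full((len(ys), len(xs)), np.nan)
grid[yi, xi] = Z

print(xs, ys)
print(grid)
```

The resulting grid can be passed to plt.imshow(grid, origin="lower"). Note that the cells are then drawn evenly spaced even though the unique coordinates are not, so the picture is a schematic of the layout rather than a map to scale.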

Related

Finding the Corners of an array of coordinates

I have a 2D array of Coordinates in Numpy.
My goal is to find the corners (as if it were a square). So:
Top left: smallest x, highest y
Top right: largest x, largest y
bottom left: smallest x, smallest y
bottom right: largest x, smallest y
Obviously each of these pairs needs to take the other values into account.
I was trying to take the min and max depending on the row:
BottomLeft = np.min(np.min(hull, axis=1), axis=0)
However, this does not keep the pair of values together. It would have to be something like the smallest possible X values, and out of those, the smallest y value. Or something along these lines.
I am assuming there is an efficient way to do this with numpy?
Here is an example of data:
[[[260 156]]
[[248 176]]
[[235 197]]
[[233 199]]
[[192 199]]
[[174 197]]
[[160 171]]
[[150 151]]
[[154 149]]
[[156 149]]
[[260 151]]]
Thanks!
Per the discussion above, this assumes that one of the pairs with the smallest x value will also correspond to the smallest y value. So you can first find the minimum x-value:
# Some sample data
d = np.array([[3, 1, 4, 1, 5],
[8, 0, 4, 2, 3]])
# smallest value in the first row which, I assume, is your x-values
xm = np.min(d[0, :])
Then you can get the subset of columns that have that minimum x value like so:
d[:, d[0, :] == xm]
So you can get the smallest y value among them via:
np.min(d[1, d[0, :] == xm])
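Applied to the (N, 1, 2) layout from the question, the same idea might look like the sketch below. The helper name pick_corner is made up for illustration, and the rule "extreme x first, then extreme y among the ties" is the assumption stated above, not a general corner detector.

```python
import numpy as np

# Sample points from the question, shaped (N, 1, 2) as (x, y) pairs.
hull = np.array([[[260, 156]], [[248, 176]], [[235, 197]], [[233, 199]],
                 [[192, 199]], [[174, 197]], [[160, 171]], [[150, 151]],
                 [[154, 149]], [[156, 149]], [[260, 151]]])

def pick_corner(points, x_pick, y_pick):
    """Hypothetical helper: take the extreme x, then the extreme y among ties."""
    pts = points.reshape(-1, 2)                      # flatten to (N, 2)
    candidates = pts[pts[:, 0] == x_pick(pts[:, 0])]  # rows with the extreme x
    return candidates[candidates[:, 1] == y_pick(candidates[:, 1])][0]

bottom_left = pick_corner(hull, np.min, np.min)
bottom_right = pick_corner(hull, np.max, np.min)
top_left = pick_corner(hull, np.min, np.max)
top_right = pick_corner(hull, np.max, np.max)

print(bottom_left, bottom_right, top_left, top_right)
```

Because x takes strict priority, bottom_left and top_left are the same point here: only one pair has the smallest x (150). If that is not what you want, the selection rule needs to weigh x and y together.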
suppose x coordinates are as follows
x = np.arange(0, 22, 2)
and suppose y coordinates are as follows
y = np.arange(20, 32, 2)
xx, yy = np.meshgrid(x, y)
yy = np.flip(yy, 0)
print(xx)
print(yy)
Then you can do any operation with xx and yy, since they hold the coordinates.
For example, let us assume z is the elevation:
z = np.random.randint(2, high=20, size=(yy.shape[0], yy.shape[1])) # xx can also be used
import matplotlib.pyplot as plt
plt.contourf(xx, yy, z)
plt.colorbar()

How to curve fit multiple y vals for single x value?

I'm trying to use numpy to curve fit (polyfit) a data set I have - it's multiple y vals for discrete x vals, i.e.:
data = [[2, 3], [3, 4], [5, 4]]
where the index is x, and the arrays are the y vals.
I tried the average/median of each array, but I get the feeling that's ignoring a lot of useful data.
TLDR:
Need to fit a curve to this scatter plot:
You could flatten your data out:
x = []
y = []
for i, ydata in enumerate(data):
    x += [i] * len(ydata)  # repeat the x value once per y value
    y += ydata
Now you can fit to x and y and it will account for all points in the set.
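As a quick check of the flattening idea, here is a sketch that feeds the flattened lists to np.polyfit. The degree 1 (a straight line) is an assumption chosen for this small example; use whatever degree suits your data.

```python
import numpy as np

data = [[2, 3], [3, 4], [5, 4]]

# Flatten: the list index is x, every entry in the sub-list is a y value.
x = []
y = []
for i, ydata in enumerate(data):
    x += [i] * len(ydata)
    y += ydata

# Least-squares fit through all six points (degree 1 assumed here).
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)
```

Every reading contributes to the fit, so an x value with more readings carries proportionally more weight than it would if you fitted to per-x averages.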

interpolation between arrays in python

What is the easiest and fastest way to interpolate between two arrays to get a new array?
For example, I have 3 arrays:
x = np.array([0,1,2,3,4,5])
y = np.array([5,4,3,2,1,0])
z = np.array([0,5])
x and y correspond to data points and z is an argument. So at z=0 the x array is valid, and at z=5 the y array is valid. But I need to get a new array for z=1. That could easily be solved by:
a = (y-x)/(z[1]-z[0])*1+x
The problem is that the data is not linearly dependent and there are more than 2 arrays with data. Maybe it is possible to use spline interpolation somehow?
This is a univariate-to-multivariate problem: one input maps to several outputs. A simple way to handle it is to iterate over the outputs and interpolate each one separately, so this is not such a big problem. Below is an example of how it can be done. I've changed the variable names a bit and added a new point:
import numpy as np
from scipy.interpolate import interp1d
X = np.array([0, 5, 10])
Y = np.array([[0, 1, 2, 3, 4, 5],
[5, 4, 3, 2, 1, 0],
[8, 6, 5, 1, -4, -5]])
XX = np.array([0, 1, 5]) # Find YY for these
YY = np.zeros((len(XX), Y.shape[1]))
for i in range(Y.shape[1]):
    # one interpolator per output column
    f = interp1d(X, Y[:, i])
    for j in range(len(XX)):
        YY[j, i] = f(XX[j])
So YY are the result for XX. Hope it helps.
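Since the question asks about splines, here is a sketch of the same data run through scipy.interpolate.CubicSpline, which accepts a 2D array of values and interpolates along the chosen axis without an explicit loop. This swaps in a different interpolator from the interp1d loop above, and the default boundary conditions are used as an assumption.

```python
import numpy as np
from scipy.interpolate import CubicSpline

X = np.array([0, 5, 10])
Y = np.array([[0, 1, 2, 3, 4, 5],
              [5, 4, 3, 2, 1, 0],
              [8, 6, 5, 1, -4, -5]])
XX = np.array([0, 1, 5])  # find the interpolated rows for these

# axis=0 means each column of Y is treated as a function of X.
spline = CubicSpline(X, Y, axis=0)
YY_spline = spline(XX)  # one row per entry of XX

print(YY_spline.shape)
print(YY_spline)
```

The spline passes through the given rows exactly, so the results at XX=0 and XX=5 reproduce the first two rows of Y; only the row for XX=1 differs from the linear result.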

Numpy array: concatenate arrays and integers

In my Python program I concatenate several integers and an array. It would be intuitive if this would work:
x,y,z = 1,2,np.array([3,3,3])
np.concatenate((x,y,z))
However, instead all ints have to be converted to np.arrays:
x,y,z = 1,2,np.array([3,3,3])
np.concatenate((np.array([x]),np.array([y]),z))
Especially if you have many variables this manual converting is tedious. The problem is that x and y are 0-dimensional arrays, while z is 1-dimensional. Is there any way to do the concatenation without the converting?
They just have to be sequence objects, not necessarily numpy arrays:
x,y,z = 1,2,np.array([3,3,3])
np.concatenate(([x],[y],z))
# array([1, 2, 3, 3, 3])
Numpy also does have an insert function that will do this:
x,y,z = 1,2,np.array([3,3,3])
np.insert(z, [0,0], [x, y])
I'll add that if you're just trying to add integers to a list, you don't need numpy to do it:
x,y,z = 1,2,[3,3,3]
z = [x] + [y] + z
or
x,y,z = 1,2,[3,3,3]
[x, y] + z
or
x,y,z = 1,2,[3,3,3]
z.insert(0, y)
z.insert(0, x)
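Two further NumPy options that avoid wrapping each scalar by hand, shown as a short sketch alongside the approaches above: np.r_ and np.hstack both accept a mix of scalars and 1D arrays.

```python
import numpy as np

x, y, z = 1, 2, np.array([3, 3, 3])

# np.r_ concatenates scalars and arrays along the first axis.
a = np.r_[x, y, z]

# np.hstack promotes each scalar to a 1D array before joining.
b = np.hstack((x, y, z))

print(a)
print(b)
```

Both give a 1D array holding 1, 2, 3, 3, 3.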

Numpy meshgrid in 3D

Numpy's meshgrid is very useful for converting two vectors to a coordinate grid. What is the easiest way to extend this to three dimensions? So given three vectors x, y, and z, construct 3x3D arrays (instead of 2x2D arrays) which can be used as coordinates.
Numpy (as of 1.8 I think) now supports generating position grids in more than two dimensions with meshgrid. One important addition which really helped me is the ability to choose the indexing order (either xy or ij for Cartesian or matrix indexing respectively), which I verified with the following example:
import numpy as np
x_ = np.linspace(0., 1., 10)
y_ = np.linspace(1., 2., 20)
z_ = np.linspace(3., 4., 30)
x, y, z = np.meshgrid(x_, y_, z_, indexing='ij')
assert np.all(x[:,0,0] == x_)
assert np.all(y[0,:,0] == y_)
assert np.all(z[0,0,:] == z_)
Here is the source code of meshgrid:
def meshgrid(x,y):
    """
    Return coordinate matrices from two coordinate vectors.

    Parameters
    ----------
    x, y : ndarray
        Two 1-D arrays representing the x and y coordinates of a grid.

    Returns
    -------
    X, Y : ndarray
        For vectors `x`, `y` with lengths ``Nx=len(x)`` and ``Ny=len(y)``,
        return `X`, `Y` where `X` and `Y` are ``(Ny, Nx)`` shaped arrays
        with the elements of `x` and y repeated to fill the matrix along
        the first dimension for `x`, the second for `y`.

    See Also
    --------
    index_tricks.mgrid : Construct a multi-dimensional "meshgrid"
                         using indexing notation.
    index_tricks.ogrid : Construct an open multi-dimensional "meshgrid"
                         using indexing notation.

    Examples
    --------
    >>> X, Y = np.meshgrid([1,2,3], [4,5,6,7])
    >>> X
    array([[1, 2, 3],
           [1, 2, 3],
           [1, 2, 3],
           [1, 2, 3]])
    >>> Y
    array([[4, 4, 4],
           [5, 5, 5],
           [6, 6, 6],
           [7, 7, 7]])

    `meshgrid` is very useful to evaluate functions on a grid.

    >>> x = np.arange(-5, 5, 0.1)
    >>> y = np.arange(-5, 5, 0.1)
    >>> xx, yy = np.meshgrid(x, y)
    >>> z = np.sin(xx**2+yy**2)/(xx**2+yy**2)
    """
    x = asarray(x)
    y = asarray(y)
    numRows, numCols = len(y), len(x)  # yes, reversed
    x = x.reshape(1,numCols)
    X = x.repeat(numRows, axis=0)
    y = y.reshape(numRows,1)
    Y = y.repeat(numCols, axis=1)
    return X, Y
It is fairly simple to understand. I extended the pattern to an arbitrary number of dimensions, but this code is by no means optimized (and not thoroughly error-checked either), but you get what you pay for. Hope it helps:
import numpy as np

def meshgrid2(*arrs):
    arrs = tuple(reversed(arrs))  # edit
    lens = list(map(len, arrs))  # list() so lens can be indexed in Python 3
    dim = len(arrs)
    ans = []
    for i, arr in enumerate(arrs):
        # shape with lens[i] along axis i and 1 everywhere else
        slc = [1] * dim
        slc[i] = lens[i]
        arr2 = np.asarray(arr).reshape(slc)
        # repeat along every other axis to fill the full grid
        for j, sz in enumerate(lens):
            if j != i:
                arr2 = arr2.repeat(sz, axis=j)
        ans.append(arr2)
    return tuple(ans)
Can you show us how you are using np.meshgrid? There is a very good chance that you really don't need meshgrid because numpy broadcasting can do the same thing without generating a repetitive array.
For example,
import numpy as np
x=np.arange(2)
y=np.arange(3)
[X,Y] = np.meshgrid(x,y)
S=X+Y
print(S.shape)
# (3, 2)
# Note that meshgrid associates y with the 0-axis, and x with the 1-axis.
print(S)
# [[0 1]
# [1 2]
# [2 3]]
s=np.empty((3,2))
print(s.shape)
# (3, 2)
# x.shape is (2,).
# y.shape is (3,).
# x's shape is broadcasted to (3,2)
# y varies along the 0-axis, so to get its shape broadcasted, we first upgrade it to
# have shape (3,1), using np.newaxis. Arrays of shape (3,1) can be broadcasted to
# arrays of shape (3,2).
s=x+y[:,np.newaxis]
print(s)
# [[0 1]
# [1 2]
# [2 3]]
The point is that S=X+Y can and should be replaced by s=x+y[:,np.newaxis] because the latter does not require (possibly large) repetitive arrays to be formed. It also generalizes to higher dimensions (more axes) easily. You just add np.newaxis where needed to effect broadcasting as necessary.
See http://www.scipy.org/EricsBroadcastingDoc for more on numpy broadcasting.
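To show how this carries over to the three-vector case from the question, here is a small sketch comparing a broadcast sum with the full meshgrid version. The vectors and the sum are arbitrary illustration values.

```python
import numpy as np

x = np.arange(2)
y = np.arange(3)
z = np.arange(4)

# Broadcasting: x varies along axis 0, y along axis 1, z along axis 2.
s = x[:, np.newaxis, np.newaxis] + y[np.newaxis, :, np.newaxis] + z

# The same result built from full coordinate arrays.
X, Y, Z = np.meshgrid(x, y, z, indexing='ij')
S = X + Y + Z

# sparse=True makes meshgrid return the small broadcastable arrays instead.
xs, ys, zs = np.meshgrid(x, y, z, indexing='ij', sparse=True)

print(s.shape, np.array_equal(s, S))
print(xs.shape, ys.shape, zs.shape)
```

The sparse arrays have shapes (2, 1, 1), (1, 3, 1) and (1, 1, 4), so they behave like the hand-written np.newaxis version while keeping the meshgrid call.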
I think what you want is
X, Y, Z = numpy.mgrid[-10:10:100j, -10:10:100j, -10:10:100j]
for example.
Here is a multidimensional version of meshgrid that I wrote:
def ndmesh(*args):
    args = map(np.asarray, args)
    return np.broadcast_arrays(*[x[(slice(None),)+(None,)*i] for i, x in enumerate(args)])
Note that the returned arrays are views of the original array data, so changing the original arrays will affect the coordinate arrays.
Instead of writing a new function, numpy.ix_ should do what you want.
Here is an example from the documentation:
>>> ixgrid = np.ix_([0,1], [2,4])
>>> ixgrid
(array([[0],
[1]]), array([[2, 4]]))
>>> ixgrid[0].shape, ixgrid[1].shape
((2, 1), (1, 2))
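The documentation example uses two vectors; the sketch below assumes the same call with three, which is the case asked about. np.ix_ returns the input values reshaped so that they broadcast against each other, and np.broadcast_arrays expands them to full grids if you really need those. The vectors are made-up illustration values.

```python
import numpy as np

x = np.array([1, 2])
y = np.array([5, 6, 7])
z = np.array([9, 10, 11, 12])

# One open-grid array per input, each varying along its own axis.
gx, gy, gz = np.ix_(x, y, z)
print(gx.shape, gy.shape, gz.shape)

# The open grids broadcast together in arithmetic.
total = gx + gy + gz
print(total.shape)

# Expand to full 3D coordinate arrays only when they are actually needed.
X, Y, Z = np.broadcast_arrays(gx, gy, gz)
print(X.shape)
```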
You can achieve that by changing the order:
import numpy as np
xx = np.array([1,2,3,4])
yy = np.array([5,6,7])
zz = np.array([9,10])
y, z, x = np.meshgrid(yy, zz, xx)
