Plotting dataset using griddata without cancelling outliers - python

I have an x, y, z dataset which contains a rather large number of points.
x and y are the positions while z is the actual observable at those coordinates.
Most coordinates have a zero value for z, while only a few of them define lines (with smoothly changing z) in the 2D map.
If I plot it with
scatter(x,y,c=z)
I get only very faint lines as the scatterpoints with color defined by z=0 are overlapping with the nonzero values of z. If I decrease the size of the points to reduce overlap, I can't see them anymore.
Here is an example of the best I could get using scatter (blue is zero z, other colors are non-zero z).
So, I thought of instead using
data = np.genfromtxt('data')
x=data[:,0]
y=data[:,1]*3.0
z=data[:,2]
grid_x, grid_y = np.mgrid[min(x):max(x):100, min(y):max(y):1000]
from scipy.interpolate import griddata
grid_z0 = griddata((x, y),z, (grid_x, grid_y), method='cubic')
im = imshow(grid_z0,origin="lower",extent=[0,0.175,-0.15,0.15]) # zoom in on specific part of data
to get a denser grid of points and possibly get wider lines due to the cubic interpolation of points around them.
However, it then seems that griddata removes the non-zero z values, treating them as outliers. This hides any possible features, and the whole grid is plotted with zero z.
Is there any python/matplotlib/... feature or trick I am missing to plot this in a nice way?
I am trying to make plots that would look something like the ones you can see in Fig. 2 (2) of https://journals.aps.org/prb/abstract/10.1103/PhysRevB.93.0854092 (you can see the figure without downloading the paper), possibly with some kind of glow around the lines.
The data I used is in this dropbox link.
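One detail in the snippet above may contribute to the all-zero image. In np.mgrid, a real third slice entry is a step size, whereas an imaginary one (such as 100j) is a number of points. With min(x):max(x):100 and a data range much smaller than 100, the grid collapses to a single node. A minimal sketch, assuming bounds borrowed from the extent in the question rather than read from the data file:

```python
import numpy as np

# Bounds borrowed from the extent used in the question; purely illustrative.
x_min, x_max = 0.0, 0.175
y_min, y_max = -0.15, 0.15

# Real step: behaves like np.arange(start, stop, step), so a step of 100
# over a range of 0.175 yields a single grid node.
coarse_x, coarse_y = np.mgrid[x_min:x_max:100, y_min:y_max:1000]

# Imaginary "step": interpreted as a point count, end point included.
fine_x, fine_y = np.mgrid[x_min:x_max:100j, y_min:y_max:1000j]

print(coarse_x.shape)  # (1, 1)
print(fine_x.shape)    # (100, 1000)
```

If that is what happened here, the interpolated image would contain only one value, regardless of how griddata treats the non-zero points.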

You can of course adjust the scatter plot itself, e.g. by scaling the marker size with z so that points with z = 0 get size 0.
import matplotlib.pyplot as plt
import numpy as np
data = np.genfromtxt('data/some_solidstate_physics_data.txt')
x=data[:,0]; y=data[:,1]*3.0; z=data[:,2]
plt.scatter(x,y,c=z, s=np.log10(z+1), cmap="PuRd", vmin=-500)
plt.show()
Since the data is already on a regular grid, there is no need to use griddata, which would only smooth the data out. Reshaping the data into a 2D array is enough.
import matplotlib.pyplot as plt
import numpy as np
data = np.genfromtxt('data/some_solidstate_physics_data.txt')
x=data[:,0]; y=data[:,1]*3.0; z=data[:,2]
ux = np.unique(x); uy = np.unique(y)
Z = z.reshape(len(ux),len(uy)).T
dx = np.diff(ux[:2])[0]; dy = np.diff(uy[:2])[0]
ext = [ux.min()-dx/2.,ux.max()+dx/2.,uy.min()-dy/2., uy.max()+dy/2.]
plt.imshow(Z, extent=ext, origin="lower", aspect="auto", cmap="magma")  # origin="lower": row 0 (smallest y) at the bottom, matching ext
plt.show()
Since the grid is very dense, the result looks somewhat pixelated.
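The reshape above relies on the rows of the file being ordered with x as the slow index and y as the fast one. If that ordering is not guaranteed, sorting first is a safer option. A sketch on synthetic data, where the helper name to_grid is hypothetical and a complete rectangular grid is assumed:

```python
import numpy as np

def to_grid(x, y, z):
    """Return (ux, uy, Z) with Z[i, j] the value at (ux[j], uy[i]).

    Assumes every (x, y) pair of a full rectangular grid occurs exactly once.
    """
    ux, uy = np.unique(x), np.unique(y)
    if x.size != ux.size * uy.size:
        raise ValueError("points do not form a complete rectangular grid")
    # lexsort sorts by the last key first: primary key y, secondary key x.
    order = np.lexsort((x, y))
    Z = z[order].reshape(uy.size, ux.size)
    return ux, uy, Z

# Synthetic, shuffled grid data standing in for the real file.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(4.0), np.arange(3.0))
x, y = gx.ravel(), gy.ravel()
z = 10 * y + x
perm = rng.permutation(x.size)
ux, uy, Z = to_grid(x[perm], y[perm], z[perm])
print(Z)
```

The returned Z has y along the rows, so it can be passed to imshow with origin="lower" without a transpose.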
You may of course also bin your data into larger chunks, for example by joining 3x3 pixels into one and taking the maximum value, using scipy.stats.binned_statistic_2d.
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import binned_statistic_2d
data = np.genfromtxt('data/some_solidstate_physics_data.txt')
x=data[:,0]; y=data[:,1]*3.0; z=data[:,2]
ux = np.unique(x); uy = np.unique(y)
h, ex, ey,_ = binned_statistic_2d(x, y, z, bins=[ux[::3],uy[::3]], statistic='max')
dx = np.diff(ex[:2])[0]; dy = np.diff(ey[:2])[0]
ext = [ux.min()-dx/2.,ux.max()+dx/2.,uy.min()-dy/2., uy.max()+dy/2.]
plt.imshow(h.T, extent=ext, origin="lower", aspect="auto", cmap="magma")  # origin="lower": row 0 (smallest y) at the bottom, matching ext
plt.show()
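For data that already sits on a regular grid, the same 3x3 maximum binning can also be sketched with plain NumPy reshaping. This is an alternative to binned_statistic_2d, not what that function does internally; block_max is a hypothetical helper name, and trailing rows or columns that do not fill a complete block are dropped.

```python
import numpy as np

def block_max(Z, k=3):
    """Maximum over non-overlapping k x k blocks of a 2D array."""
    rows, cols = (Z.shape[0] // k) * k, (Z.shape[1] // k) * k
    trimmed = Z[:rows, :cols]          # drop incomplete edge blocks
    blocks = trimmed.reshape(rows // k, k, cols // k, k)
    return blocks.max(axis=(1, 3))

# Mostly-zero toy image with a thin diagonal line, similar in spirit
# to the sparse lines described in the question.
Z = np.zeros((9, 9))
Z[np.arange(9), np.arange(9)] = np.arange(1, 10)
print(block_max(Z))
```

Taking the maximum rather than the mean keeps thin non-zero lines from being averaged away by the surrounding zeros.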
Having those techniques at your disposal you may then decide to beautify your result at the expense of quantitative accuracy.
For example, you can apply a Gaussian filter (scipy.ndimage.gaussian_filter) to the data and additionally use interpolation="gaussian" when plotting.
import matplotlib.pyplot as plt
import numpy as np
import scipy.ndimage
data = np.genfromtxt('data/some_solidstate_physics_data.txt')
x=data[:,0]; y=data[:,1]*3.0; z=data[:,2]
ux = np.unique(x); uy = np.unique(y)
Z = z.reshape(len(ux),len(uy)).T
Z = scipy.ndimage.gaussian_filter(Z, 3)  # smooth with a sigma of 3 pixels
dx = np.diff(ux[:2])[0]; dy = np.diff(uy[:2])[0]
ext = [ux.min()-dx/2.,ux.max()+dx/2.,uy.min()-dy/2., uy.max()+dy/2.]
plt.imshow(Z, extent=ext, origin="lower", aspect="auto", cmap="magma", interpolation="gaussian")
plt.show()

Related

Python: Creating a Grid of X,Y coordinates and corresponding calculated Z values to result in a 3D array of XYZ

I have a function that calculates a z value from a given x and y coordinate. I then want to combine these values together to get a 3D array of x,y,z. I'm attempting to do this with the code below:
#import packages
import pandas as pd
import math
import numpy as np
import matplotlib.mlab as mlab
import matplotlib.tri as tri
import matplotlib.pyplot as plt
from matplotlib import rcParams
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable
from mpl_toolkits.mplot3d import Axes3D
#Define function to calculate z over a grid
def func(X, Y, x, y, Q):
    return (Q / (2 * np.pi)) * np.arctan((y-Y)/(x-X))
#For initial testing just defining the IW explicitly, last step will be to read the input file and pull this data
X1=2417743.658
Y1=806346.704
Q1=5
X2=2417690.718
Y2=806343.693
Q2=5
X3=2417715.221
Y3=806309.685
Q3=5
#initiate the XY grid
xi = np.linspace(2417675,2417800,625)
yi = np.linspace(806300,806375,375)
#mesh the grid in to x,y space
x,y = np.meshgrid(xi,yi)
#calculate the values over the grid at every x,y using the defined function above
zi = (func(X1,Y1,x,y,Q1)+func(X2,Y2,x,y,Q2)+func(X3,Y3,x,y,Q3))
#reshape the xy space into 3d space - when i plot this grid it looks correct
xy = np.array([[(x, y) for x in xi] for y in yi])
#reshape z into 3d space - this appears to be where the issue begins
z = np.array(zi).reshape(xy.shape[0],xy.shape[1], -1)
#combined xyz into a single grid
xyz = np.concatenate((xy, z), axis = -1)
# Create figure and add axis
fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111)
img = ax.imshow((xyz*255).astype(np.uint8))
output:
I do get an XYZ array, and when I print it the values appear to be mapped correctly. However, when I plot the data, the y values are shown "upside down": the output looks as it should, but flipped about the x axis. Additionally, the axes show node indices rather than the X, Y values. I want the (0, 0) point to be in the lower left-hand corner, as in Cartesian coordinates, and each x, y to have a corresponding z calculated from that x, y. I know there must be an easier way to go about this. Does anyone know a better way, or what I'm doing wrong here?
Thanks
ax.imshow() has an origin option that lets you specify where the [0, 0] element of the array is placed.
https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.imshow.html
origin : {'upper', 'lower'}, default: rcParams["image.origin"] (default: 'upper')
Place the [0, 0] index of the array in the upper left or lower left corner of the Axes. The convention (the default) 'upper' is typically used for matrices and images.
Note that the vertical axis points upward for 'lower' but downward for 'upper'.
See the origin and extent in imshow tutorial for examples and a more detailed description.
Try to modify to this:
img = ax.imshow((xyz*255).astype(np.uint8), origin='lower')
The axis tick labels can be changed with the following commands:
ax.set_xticks(LIST_OF_INDICES)
ax.set_xticklabels(LIST_OF_VALUES)
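An alternative to relabelling ticks, sketched here under the assumption that only the scalar field zi needs to be shown, is to pass the coordinate range through extent so that the axes carry the real X, Y values. The grid and the first well position are taken from the question; the half-cell padding is a choice made for this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt

def func(X, Y, x, y, Q):
    # Same expression as in the question.
    return (Q / (2 * np.pi)) * np.arctan((y - Y) / (x - X))

xi = np.linspace(2417675, 2417800, 625)
yi = np.linspace(806300, 806375, 375)
x, y = np.meshgrid(xi, yi)          # both have shape (len(yi), len(xi))
zi = func(2417743.658, 806346.704, x, y, 5)

# Pad by half a cell so each pixel is centred on its grid node.
dx, dy = xi[1] - xi[0], yi[1] - yi[0]
extent = [xi[0] - dx / 2, xi[-1] + dx / 2, yi[0] - dy / 2, yi[-1] + dy / 2]

fig, ax = plt.subplots(figsize=(4, 4))
# origin="lower" puts row 0 (smallest y) at the bottom, Cartesian style.
img = ax.imshow(zi, origin="lower", extent=extent, aspect="auto")
fig.colorbar(img, ax=ax, label="z")
```

Calling plt.show() afterwards displays the figure. Plotting zi directly also avoids packing x, y and z into the colour channels of a single image, which is what the uint8 conversion in the question ends up doing.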

Plotting 2D scalar velocity field with matplotlib

I have the following dataframe which I'm trying to plot,
x,y,u,v
-0.157806993154554,-0.05,0.000601310515776,0.003318849951029
-0.374687807296859,-0.35,-0.001057069515809,2.9686838388443E-05
-1,-0.323693574077183,-0.002539682900533,-0.008748378604651
-0.486242955499287,-0.35,-0.001797694480047,0.000218112021685
-0.54184300562917,-0.05,0.001513708615676,0.001884449273348
0,-0.31108016382718,5.28732780367136E-05,-0.000818025320768
-0.428046308037431,-0.35,-0.001458290731534,8.22432339191437E-05
-0.343159653530217,-0.05,0.00112508633174,0.002580288797617
-0.386254219645565,-0.35,-0.001139726256952,2.6945024728775E-05
-0.600252053226546,-0.05,0.001246933126822,0.00207519903779
-1,-0.061575842243108,-0.000705834245309,0.043682213872671
0,-0.052056831172645,0.009899478405714,-0.003894355148077
-0.903283837058102,-0.35,5.81557396799326E-05,-0.001065131276846
-0.418202966058798,-0.05,0.001158628845587,0.002409461885691
-0.809266339501268,-0.35,0.000290673458949,-2.0977109670639E-05
0,-0.066616962597653,2.92772892862558E-05,0.001737955957651
-0.090282152608,-0.35,0.00151876010932,0.001403901726007
-1,-0.173440678035212,-0.007741978392005,0.006023477762938
-1,-0.155079864747918,-0.00761691480875,0.007886063307524
-0.222728396757266,-0.35,0.000686463201419,0.000264558941126
where x, y are the positional coordinates and u, v are the velocity components at that point. (full dataset - https://pastebin.pl/view/0f60b48e)
I want to plot my data like so (contour lines and arrows are not required).
How do I do this?
So far I've tried:
import numpy as np
import matplotlib.pyplot as plt
# Meshgrid
#x, y = np.meshgrid(box_df['x'], box_df['y'])
x,y = box_df['x'], box_df['y']
# Directional vectors
#u, v = np.meshgrid(box_df['u'], box_df['v'])
u = box_df['u']
v = box_df['v']
# Plotting Vector Field with QUIVER
plt.quiver(x, y, u, v, color='g')
plt.title('Vector Field')
# Show plot with grid
plt.grid()
If you want to plot a scalar field from irregularly spaced data points, you can either interpolate between the data points onto a regular grid yourself, or use matplotlib.pyplot.tricontour and tricontourf, which triangulate the points and interpolate for you.
Using tricontour you could try:
import numpy as np
import matplotlib.pyplot as plt
x, y = box_df.x, box_df.y
# make scalar field
speed = np.hypot(box_df.u, box_df.v)
# Plotting scalar field with tricontour
plt.tricontourf(x, y, speed)
plt.title('Scalar Field')
# Show plot with grid
plt.grid()
However it appears that you only have data around the edge of a rectangle, so interpolation into the interior of the rectangle is likely to be poor.
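The first option, interpolating onto a regular grid, could look roughly like this. It is a sketch on synthetic scattered points rather than the asker's box_df, and it uses matplotlib.tri.LinearTriInterpolator, which is only one of several possible interpolators. Grid points outside the triangulated region come back masked.

```python
import numpy as np
import matplotlib.tri as mtri

# Synthetic scattered samples standing in for box_df (assumption).
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 0.0, 200)
y = rng.uniform(-0.35, -0.05, 200)
u, v = 2.0 * x, 3.0 * y
speed = np.hypot(u, v)              # scalar field: velocity magnitude

# Delaunay triangulation of the sample points plus linear interpolation.
triang = mtri.Triangulation(x, y)
interp = mtri.LinearTriInterpolator(triang, speed)

# Regular grid on which to evaluate the field.
gx, gy = np.meshgrid(np.linspace(-1.0, 0.0, 60), np.linspace(-0.35, -0.05, 40))
speed_grid = interp(gx, gy)         # masked array, shape (40, 60)
```

The resulting speed_grid can then be drawn with, for example, plt.pcolormesh(gx, gy, speed_grid). The caveat above still applies: if the real samples lie only along the edges, the interior values are interpolated across long, thin triangles and should not be trusted much.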

Two dimensional distribution with contour plotting with Gaussian smoothing

I am trying to plot a two-dimensional distribution with Gaussian smoothing applied to all the data points. Here is my sample data. I would like the plot to look like this figure from a paper.
[Two-dimensional distribution of Hα emitters(galaxies). The small circles indicate the positions of Hα emitters. The colored contours indicate 1 σ, 1.5 σ, 2 σ, 3 σ, 4 σ, 5 σ above the mean density distribution computed with all member galaxies (i.e., the photo-z-selected sample and Hα emitters). Here we apply Gaussian smoothing for all the data points with σ ∼ 0.75 Mpc, and co-add the tail of the Gaussian wing at each position. The large gray circles show the object masks; we show only large masks with radius of >2΄ for clarity. (Color online)]
I am trying to plot the same plot with my data. But I am not able to plot like this.
Here is my code
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.interpolate import griddata
import scipy as sp
import scipy.ndimage
from astropy.table import Table
import branca
import pandas as pd
%matplotlib inline
#DATA
data=Table.read('s_data.fits')
density_mean=np.mean(data['density'])
std=np.std(data['density'])
vmin = density_mean+std
vmax = density_mean+(5*std)
levels = len(colors)
cm = branca.colormap.LinearColormap(colors, vmin=vmin, vmax=vmax).to_step(levels)
x_orig = data['ra']
y_orig = data['dec']
z_orig = data['density']
# Make a grid
x_arr = np.linspace(np.min(x_orig), np.max(x_orig), 500)
y_arr = np.linspace(np.min(y_orig), np.max(y_orig), 500)
#WHEN I CHANGE THE VALUE FROM 500 to ANOTHER IT DOESN'T COVER ALL THE AREA. WHAT WILL BE THE APPROPRIATE VALUE FOR THIS?
x_mesh, y_mesh = np.meshgrid(x_arr, y_arr)
# Grid the values
z_mesh = griddata((x_orig, y_orig), z_orig, (x_mesh, y_mesh), method='linear')
# Gaussian filter the grid to make it smoother
sigma = [5, 5]
z_mesh = sp.ndimage.filters.gaussian_filter(z_mesh, sigma, mode='constant')
# Create the contour
plt.plot(data['ra'], data['dec'], 'ko', markersize=1,alpha=1)
plt.text(352.33,0.135, 'CL1',fontsize=15) #CL1 coordinate
plt.text(352.096,0.390, 'CL2', fontsize=15) #CL2 coordinate
plt.contourf(x_mesh, y_mesh, z_mesh, levels, alpha=0.7, colors=colors, linestyles='None',
vmin=vmin, vmax=vmax)
#plt.colorbar()
plt.gca().invert_xaxis()
And this is what I got.
I know something is wrong with my data. Kindly help to get the proper figure like given in the paper. Thank you.
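The procedure described in the figure caption (smooth the point distribution with a Gaussian, then draw contours at fixed multiples of σ above the mean density) can be sketched as follows. This is an illustration on synthetic positions, not a fix verified against s_data.fits; the bin count, the smoothing width in pixels and the clump position are all assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Synthetic RA/Dec positions: uniform background plus one clump (assumption).
rng = np.random.default_rng(2)
ra = np.concatenate([rng.uniform(352.0, 352.5, 400), rng.normal(352.33, 0.02, 150)])
dec = np.concatenate([rng.uniform(0.0, 0.5, 400), rng.normal(0.135, 0.02, 150)])

# Count objects per cell, then smooth the counts with a Gaussian kernel.
counts, ra_edges, dec_edges = np.histogram2d(ra, dec, bins=100)
density = gaussian_filter(counts, sigma=4, mode="constant")

# Contour levels at 1, 1.5, 2, 3, 4, 5 sigma above the mean density.
mean, std = density.mean(), density.std()
levels = mean + np.array([1, 1.5, 2, 3, 4, 5]) * std

# Cell centres; histogram2d returns counts[ix, iy], so transpose for plotting.
ra_c = 0.5 * (ra_edges[:-1] + ra_edges[1:])
dec_c = 0.5 * (dec_edges[:-1] + dec_edges[1:])
density_for_plot = density.T
```

The arrays can then be passed to plt.contourf(ra_c, dec_c, density_for_plot, levels=levels, extend="max"). Because the grid is built from bin counts rather than from griddata, it covers the full data range whatever bin count is chosen.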

Python: Plot residuals on a fitted model

I want to plot the lines (residuals; cyan lines) between data points and the estimated model. Currently I'm doing so by iterating over all data points in my income pandas.DataFrame and adding vertical lines. x, y are the points' coordinates and predicted are the predictions (here the blue line).
plt.scatter(income["Education"], income["Income"], c='red')
plt.ylim(0,100)
for indx, (x, y, _, _, predicted) in income.iterrows():
    plt.axvline(x, y/100, predicted/100) # /100 because it needs floats [0,1]
Is there a more efficient way? This doesn't seem like a good approach for more than a few rows.
First of all, note that axvline only works here by coincidence. In general, the y values taken by axvline are in coordinates relative to the axes (from 0 to 1), not in data coordinates.
In contrast, vlines uses data coordinates and also has the advantage of accepting arrays of values. It then creates a single LineCollection, which is more efficient than many individual lines.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1.2,1.2,20)
y = np.sin(x)
dy = (np.random.rand(20)-0.5)*0.5
fig, ax = plt.subplots()
ax.plot(x,y)
ax.scatter(x,y+dy)
ax.vlines(x,y,y+dy)
plt.show()

Making a contour plot with solutions from systems of differential equations with pylab

I'm solving a system of differential equations numerically, and I have x, y and z, each of which is a solution. Each array is one-dimensional, and, for example, x[0], y[0], z[0] together describe one point in space. I want to graph these in a contour plot using the usual x, y, z coordinates, but it says z needs to be a 2D array. I know how to make a mesh from x and y, but how do I do this for z?
I have made a mesh out of x and y, but I don't know what to do for z.
If someone could give me some insight, it would be much appreciated.
It is not enough to just mesh x and y; you need to interpolate your data onto a regular grid to be able to make a contour plot. To do this you should look into matplotlib.mlab.griddata (http://matplotlib.org/examples/pylab_examples/griddata_demo.html).
I'll paste the example code from the link below with some extra comments:
from numpy.random import uniform, seed
from matplotlib.mlab import griddata
import matplotlib.pyplot as plt
import numpy as np
# Here the code generates some x and y coordinates and some corresponding z values.
seed(0)
npts = 200
x = uniform(-2,2,npts)
y = uniform(-2,2,npts)
z = x*np.exp(-x**2-y**2)
# Here you define a grid (of arbitrary dimensions, but equal spacing) onto which your data will be mapped
xi = np.linspace(-2.1,2.1,100)
yi = np.linspace(-2.1,2.1,200)
# Map the data to the grid to get a 2D array of remapped z values
zi = griddata(x,y,z,xi,yi,interp='linear')
# contour the gridded data, plotting dots at the nonuniform data points.
CS = plt.contour(xi,yi,zi,15,linewidths=0.5,colors='k')
CS = plt.contourf(xi,yi,zi,15,cmap=plt.cm.rainbow,
vmax=abs(zi).max(), vmin=-abs(zi).max())
plt.colorbar() # draw colorbar
# Plot the original sampling
plt.scatter(x,y,marker='o',c='b',s=5,zorder=10)
plt.xlim(-2,2)
plt.ylim(-2,2)
plt.title('griddata test (%d points)' % npts)
plt.show()
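Note that matplotlib.mlab.griddata was removed in Matplotlib 3.1, so the example above only runs on older versions. A roughly equivalent sketch using scipy.interpolate.griddata (a substitution, not the original demo code) is:

```python
import numpy as np
from scipy.interpolate import griddata

# Same kind of synthetic scattered data as in the example above.
rng = np.random.default_rng(0)
npts = 200
x = rng.uniform(-2, 2, npts)
y = rng.uniform(-2, 2, npts)
z = x * np.exp(-x**2 - y**2)

# Regular target grid; meshgrid gives 2D coordinate arrays of shape (200, 100).
xi = np.linspace(-2.1, 2.1, 100)
yi = np.linspace(-2.1, 2.1, 200)
XI, YI = np.meshgrid(xi, yi)

# Linear interpolation; grid nodes outside the convex hull of the data are NaN.
zi = griddata((x, y), z, (XI, YI), method="linear")
```

The result can be contoured as before, e.g. plt.contourf(xi, yi, np.ma.masked_invalid(zi), 15), where the masking keeps the NaN nodes outside the data region out of the plot.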
It looks like you are looking for line or scatter plots instead of contour.
