Interpolation causes issues with plotting in xarray - python

I'm trying to do some analysis of CMIP6 climate model data, and I want to produce some multi-model ensemble plots. In order to do that I need to interpolate the data to a common grid. The interpolation of the data seems to work fine, but whenever I try to plot the interpolated data it seems like there's a gap in the data at the prime meridian and at the poles. Here's a simple example where I interpolate MPI-ESM data to the grid of CESM2-WACCM:
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
cesm2_waccm = xr.open_dataset('pr_day_CESM2-WACCM_ssp245_r2i1p1f1_gn_20750101-20841231.nc')
mpi = xr.open_dataset('pr_day_MPI-ESM1-2-LR_ssp245_r1i1p1f1_gn_20750101-20941231.nc')
cesm2_waccm_subset = cesm2_waccm.sel(time=slice('2075-01-01', '2075-12-31')).mean(dim='time')
mpi_subset = mpi.sel(time=slice('2075-01-01', '2075-12-31')).mean(dim='time')
map_proj = ccrs.PlateCarree()
# This works.
plot = mpi_subset.pr.plot(subplot_kws={'projection': map_proj})
plot.axes.coastlines()
plt.show()
mpi_interp = mpi_subset.interp(lat=cesm2_waccm_subset['lat'], lon=cesm2_waccm_subset['lon'])
# This has a white line at prime meridian.
plot = mpi_interp.pr.plot(subplot_kws={'projection': map_proj})
plot.axes.coastlines()
plt.show()
As you can see, the first plot is fine:
And the second plot has a white line at the prime meridian and at the poles:
Is there anything I can do to get rid of that line?
Also, here are the versions of the packages I have installed which I think are relevant to this question:
xarray 0.20.1
cartopy 0.20.1
matplotlib 3.3.2
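One possible explanation, offered as an assumption since no answer is shown here: `interp` returns NaN for target points that lie outside the source coordinate range. If the target grid's last longitude lies past the source's last longitude, or its latitudes reach ±90 while the source stops short of the poles, those cells stay empty and plot as white. The sketch below uses synthetic grids (all grid spacings and variable names are made up for illustration) to wrap the source data in longitude and extrapolate linearly in latitude before plotting.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the coarser source grid (assumed layout:
# longitudes stop short of 360, latitudes stop short of the poles).
src_lon = np.arange(0.0, 360.0, 1.875)
src_lat = np.linspace(-88.5, 88.5, 96)
src = xr.DataArray(
    np.cos(np.deg2rad(src_lat))[:, None] * np.ones(src_lon.size)[None, :],
    coords={"lat": src_lat, "lon": src_lon},
    dims=("lat", "lon"),
    name="pr",
)

# Synthetic stand-in for the target grid: it reaches the poles and its
# last longitude lies beyond the last source longitude.
dst_lon = np.arange(0.0, 360.0, 1.25)
dst_lat = np.linspace(-90.0, 90.0, 192)

# Plain interpolation: target points outside the source range become NaN.
naive = src.interp(lat=dst_lat, lon=dst_lon)

# 1) Wrap longitude: append a copy of the first column at lon + 360.
wrap = src.isel(lon=slice(0, 1))
wrap = wrap.assign_coords(lon=wrap.lon + 360.0)
src_cyclic = xr.concat([src, wrap], dim="lon")

# 2) Interpolate in longitude, then in latitude with linear
#    extrapolation for rows beyond the first/last source latitude.
filled = src_cyclic.interp(lon=dst_lon).interp(
    lat=dst_lat, kwargs={"fill_value": "extrapolate"}
)

print(int(naive.isnull().sum()), int(filled.isnull().sum()))
```

Extrapolating to the poles is a crude choice made here only to close the gap; for an ensemble analysis a dedicated regridding tool that handles periodic longitudes may be preferable.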

Related

Polar pcolormesh shifts center when used set_ylim in matplotlib

This is only an excerpt of the code I am using, but it reproduces the problem I am facing. I am trying to plot the density of particles over a disc, so a polar plot seems the natural choice. I use the following code to read a density file in which rows and columns correspond to the radial and angular directions.
#! /usr/bin/env python
import numpy as np
import matplotlib.pyplot as plt
from os.path import exists
from os import sys
import matplotlib as mpl
from matplotlib import rc
NUMBINS=100
rmax=20.0
dR2=rmax*rmax/NUMBINS
density = np.random.random((NUMBINS, NUMBINS))
r = np.sqrt(np.arange(0,rmax*rmax,dR2) )[:NUMBINS]
theta = np.linspace(0,2*np.pi,NUMBINS)
mpl.rcParams['legend.fontsize'] = 10
mpl.rcParams['pcolor.shading'] ='nearest'
fig = plt.figure(figsize=(5, 5))
ax1 = plt.subplot(111,projection="polar")
rad, th = np.meshgrid(r,theta)
ax1.set_yticks(np.arange(0,rmax,3))
ax1.pcolormesh(th,rad,density,cmap='Blues')
#ax1.set_ylim([rad[0,0], rad[0,NUMBINS-1]])
plt.tight_layout()
plt.show()
which gives me the following plot :
As you can see, the radius runs from 0 to rmax, so uncommenting the line
ax1.set_ylim([rad[0,0], rad[0,NUMBINS-1]])
should not have any effect on the plot, but it shifts the center of the plot :
I don't understand why setting ymin=0 creates this white space in the center.
It turns out that this was a problem with the version of matplotlib I was using. I tried a different version and the plot works as expected. Apologies for not trying that earlier.

4D Density Plot in Python

I am looking to plot some density maps from some grid-like data:
X,Y,Z = np.mgrid[-5:5:50j, -5:5:50j, -5:5:50j]
rho = np.random.rand(50,50,50) #for the sake of argument
I am interested in producing an interpolated density plot as shown below, from Mathematica here, using Python.
Is there any solution in Matplotlib or another plotting suite for this sort of plot?
To be clear, I do not want a scatterplot of coloured points, which is not suitable for the plot I am trying to make. I would like a 3D interpolated density plot, as shown below.
Plotly
Plotly Approach from https://plotly.com/python/3d-volume-plots/ uses np.mgrid
import plotly.graph_objects as go
import numpy as np
X, Y, Z = np.mgrid[-8:8:40j, -8:8:40j, -8:8:40j]
values = np.sin(X*Y*Z) / (X*Y*Z)
fig = go.Figure(data=go.Volume(
x=X.flatten(),
y=Y.flatten(),
z=Z.flatten(),
value=values.flatten(),
isomin=0.1,
isomax=0.8,
opacity=0.1, # needs to be small to see through all surfaces
surface_count=17, # needs to be a large number for good volume rendering
))
fig.show()
Pyvista
Volume Rendering example:
https://docs.pyvista.org/examples/02-plot/volume.html#sphx-glr-examples-02-plot-volume-py
3D-interpolation code you might need with pyvista:
interpolate 3D volume with numpy and or scipy
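As a rough illustration of the interpolation step mentioned above, here is a minimal sketch using scipy's `RegularGridInterpolator` on the question's 50×50×50 grid. The finer resolution (80 points per axis) and the variable names are assumptions for the example, not something taken from the linked answers.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Axes matching np.mgrid[-5:5:50j, -5:5:50j, -5:5:50j] ('ij' ordering).
axis = np.linspace(-5, 5, 50)
rho = np.random.rand(50, 50, 50)  # placeholder density, as in the question

# Linear interpolator defined on the regular grid.
interp = RegularGridInterpolator((axis, axis, axis), rho)

# Sample on a finer grid (assumed resolution: 80 points per axis).
fine = np.linspace(-5, 5, 80)
Xf, Yf, Zf = np.meshgrid(fine, fine, fine, indexing="ij")
points = np.stack([Xf.ravel(), Yf.ravel(), Zf.ravel()], axis=-1)
rho_fine = interp(points).reshape(Xf.shape)

print(rho_fine.shape)
```

The flattened `Xf`, `Yf`, `Zf` and `rho_fine` arrays could then be passed to a volume renderer such as the `go.Volume` call shown earlier.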

How can I change the values on Y axis of Histogram plot in Python

I have data in the CSV file. I am trying to plot a histogram using matplotlib.
Here is the code that I am trying.
data.hist(bins=10)
plt.ylabel('Frequency')
plt.xlabel('Data')
plt.show()
This is the plot that I get.
Now, using the same code, I need to create a normalized histogram that shows the probability distribution of the data. On the y-axis, instead of the number of data points that fall in each bin, I want to plot the number of data points in that bin divided by the total number of data points.
How should I do it?
Pandas' histogram adds some functionality to the underlying pyplot.hist(). Many of the parameters are passed through. One of them is density=.
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
data = pd.DataFrame(np.random.uniform(258.1, 262.3, 20))
data.hist(bins=10, density=True)
plt.ylabel('Density')
plt.xlabel('Data')
plt.show()
A related library, seaborn, can draw a density histogram together with a KDE curve as an approximation of the probability distribution. Note that distplot is deprecated in recent seaborn releases in favour of histplot and displot.
import seaborn as sns
sns.distplot(data, bins=10)
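Note that `density=True` scales the bars so that their total area is 1, which is not quite the same as "count in bin divided by total count". If the per-bin fraction is what is wanted, one option is to weight every point by 1/N. A small numpy sketch with made-up data showing the difference:

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.uniform(258.1, 262.3, 20)  # made-up sample, same range as above

# Each point contributes 1/N, so bar heights are fractions of the total.
weights = np.ones_like(values) / values.size
fractions, edges = np.histogram(values, bins=10, weights=weights)

# density=True instead scales heights so that the area integrates to 1.
density, _ = np.histogram(values, bins=10, density=True)

print(fractions.sum())                   # heights add up to 1
print((density * np.diff(edges)).sum())  # area adds up to 1
```

The same `weights` array can be passed to matplotlib directly, e.g. `plt.hist(values, bins=10, weights=weights)`.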

How to stop numpy trendline from going below 0 on matplotlib graph

I am creating several scatter plot graphs in matplotlib. For these I want to plot trend lines for the scatter plots. I am using the numpy polyfit and poly1d methods to create the trendline.
My problem is as follows: There are only positive y values in my dataset (I have also removed all 0 values), but my trendlines are going below 0. The reason why I think it's going below 0 is that I have some very large outlier values that skew the trendline.
Is there a way I can prevent my graph trendlines from going below 0 without removing data points? Perhaps using a method or parameter for a method in the numpy or matplotlib libraries?
Removing outliers helps some trendlines, but not all of the multiple graphs I'm making.
Graph example with scatter points: https://imgur.com/a/bwIFJw7
Graph example without scatter points (same data as above graph): https://imgur.com/a/k5TyNjt
Changing the degree of the trend line doesn't solve the issue.
Code to reproduce:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
import numpy as np
plt.figure(figsize=(20,150))
loc = mdates.AutoDateLocator()
dataset = {'time':['4/5/2014','4/10/2014','4/21/2014','5/3/2014','5/8/2014','5/19/2014','6/7/2014','6/12/2014','6/16/2014','12/6/2014','12/11/2014','12/15/2014','2/7/2015','2/12/2015','2/16/2015','7/20/2015','8/1/2015','8/13/2015','8/17/2015','9/5/2015','9/10/2015','9/21/2015','10/3/2015','12/10/2015','1/18/2016','8/6/2016','8/11/2016','8/15/2016','9/3/2016','9/8/2016','9/19/2016','10/1/2016','10/13/2016','10/17/2016','11/10/2016','11/5/2016','8/10/2017','9/14/2017','9/18/2017','10/7/2017','2/8/2018','2/19/2018','3/3/2018','3/8/2018','3/19/2018','4/12/2018','4/7/2018','4/16/2018','5/5/2018','5/10/2018','5/21/2018','11/3/2018','11/8/2018','11/19/2018','12/1/2018','12/13/2018','12/17/2018','1/5/2019','1/10/2019','1/21/2019','2/2/2019','2/14/2019','2/18/2019','3/2/2019','3/14/2019','3/18/2019','4/6/2019','4/11/2019','4/15/2019'],'yval':[1714.6,996.32,1638.4,1293.47,744.73,1843.2,1009.97,2168.47,819.2,2949.12,2730.67,2106.51,14745.6,3880.42,73728,792.77,538.16,585.14,571.53,580.54,933.27,460.8,646.74,4336.94,36864,190.51,206.89,199.02,197.54,219.84,210.27,223.75,201.96,212.23,223.6,211.48,1568.68,418.91,837.82,5671.38,217.18,189.74,192.59,192.04,196.74,197.8,196.47,200.69,193.69,210.79,349.42,222.5,209.17,191.37,192.91,197.57,207.23,192.48,189.7,199.44,187.57,186.85,187.99,189.19,196.34,196.11,192.61,196.39,190.05,]}
dataset['time'] = pd.to_datetime(dataset['time'])
dataset['yval'] = pd.to_numeric(dataset['yval'])
x = mdates.date2num(dataset['time'])
y = dataset['yval']
z = np.polyfit(x,y,3)
p = np.poly1d(z)
plt.plot(x,p(x),'#00FFFF', label = type)
plt.title(type)
plt.xlabel('Time')
plt.ylabel('Weight')
#comment out the next line to see plot without scatter points
plt.scatter(x,y)
plt.gca().xaxis.set_major_locator(loc)
plt.gca().xaxis.set_major_formatter(mdates.AutoDateFormatter(loc))
plt.grid(which='major',axis='both')
plt.show()
The desired output is a graph whose trendline does not go below the horizontal 0 axis.
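One approach that might help, offered as a suggestion rather than a built-in polyfit option: since all y values are positive, fit the polynomial to log(y) and exponentiate the result. The exponential can never go below zero, and large outliers pull the fit less in log space. A minimal sketch with hypothetical data (not the dataset above):

```python
import numpy as np

# Hypothetical positive-only data with two large outliers.
x = np.arange(20, dtype=float)
y = np.full(20, 200.0)
y[[3, 4]] = [15000.0, 70000.0]

# Ordinary cubic fit on the raw values: nothing keeps it above zero.
raw_trend = np.poly1d(np.polyfit(x, y, 3))(x)

# Fit the same cubic to log(y), then map back with exp.
log_coeffs = np.polyfit(x, np.log(y), 3)
log_trend = np.exp(np.polyval(log_coeffs, x))

print(raw_trend.min(), log_trend.min())
```

This changes what the trendline means (it follows the typical magnitude rather than the arithmetic mean), so it is a trade-off rather than a drop-in fix. With date-based x values from `date2num`, subtracting `x.min()` before fitting may also make the polynomial fit better conditioned.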

pcolormesh draws not points but lines between data points

I need to plot rainfall sums (from satellite observations) from grib2 files onto a map. I managed to load the data via text files into numpy arrays and tie it to picture coordinates using Basemap. The problem is that Python does not draw coloured points from the data, but instead tends to draw lines between neighbouring points in the data field, so the plot looks ugly.
I do not see the source of the problem.
Fragments of my code are:
import numpy as np
import matplotlib
matplotlib.use('Agg')
from scipy import *
from pylab import *
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import cm
After that I read the data needed and produce 3 numpy arrays with shapes approximately (100000, 2), which contain respectively latitude, longitude, in degrees and the value of each data point. I visualize it using these commands:
def joonista(lats,lons,value,nimi,clevs,koloriit):
    # ---------fragment of described reshaping (not shown), produces arrays "lats", "lons", "value"------------
    map=Basemap(projection='aea',lat_1=30,lat_2=50,lat_0=45,lon_0=0,llcrnrlon=-30,llcrnrlat=20,urcrnrlon=80,urcrnrlat=53,resolution='l',)
    x, y = map(lons, lats)
    map.drawcoastlines(linewidth=0.17,color='0.7')
    map.drawcountries(linewidth=0.17,color='0.7')
    map.drawmeridians(np.arange(-50,60,5),linewidth=0.17,color='0.7',labels=[False,False,False,True])
    map.drawparallels(np.arange(-25, 70, 5),linewidth=0.17,color='0.7',labels=[True,False,False,False])
    varvid=mpl.colors.ListedColormap(koloriit)
    norm = mpl.colors.BoundaryNorm(clevs,varvid.N)
    cs = map.pcolormesh(x,y,value,cmap=varvid,norm=norm)
    savefig(nimi,dpi=300)
    plt.clf()
joonista(latA,lonA,valueA,'h05',[-1,0.00001,0.001,0.01,0.1,0.3,0.5,1,2,3,4,5,6,7,8,9,10,11,12,13],['k','c','#a0fff9','#00b354','#69b300','#97ff03','#C2524D','#FF7500','#b3a900','#fff551','#515bff','#45adff','#da000d','#ff2a36','#ffa0a5','#f003ff','#f778ff','0.5','0.75'])
joonista(latB,lonB,valueB,'h04',[-1,0.0000000000001,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18],['k','c','#a0fff9','#00b354','#69b300','#97ff03','#C2524D','#FF7500','#b3a900','#fff551','#515bff','#45adff','#da000d','#ff2a36','#ffa0a5','#f003ff','#f778ff','0.5','0.75'])
Here is an example picture:
I would be grateful for any advice on how to solve this problem.
Aleksei
Following Joe Kington's recommendation, I replaced the command
cs=map.pcolormesh(x,y,value,cmap=varvid,norm=norm)
with the command
cs=plt.scatter(x,y,c=value,s=0.6, edgecolors='none',marker=',',cmap=varvid,norm=norm)
which visualises the precipitation distribution well.
Thanks for the assistance!
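A plausible reading of why this happened: pcolormesh expects values laid out on a 2D mesh and draws a quadrilateral between neighbouring array entries, so scattered points that are merely reshaped into 2D get joined by stretched cells. If a pcolormesh-style plot is still wanted, the points can first be binned onto a regular grid. A minimal numpy sketch with made-up data; the 0.5 degree cell size and all array names are assumptions:

```python
import numpy as np

# Hypothetical scattered observations: 1-D arrays of equal length.
rng = np.random.default_rng(1)
lons = rng.uniform(-30, 80, 5000)
lats = rng.uniform(20, 53, 5000)
value = rng.uniform(0, 13, 5000)

# Regular cell edges (0.5 degree is an assumed resolution).
lon_edges = np.arange(-30, 80.5, 0.5)
lat_edges = np.arange(20, 53.5, 0.5)

# Sum of values and number of points per cell, then the mean per cell.
total, _, _ = np.histogram2d(lats, lons, bins=[lat_edges, lon_edges], weights=value)
count, _, _ = np.histogram2d(lats, lons, bins=[lat_edges, lon_edges])
with np.errstate(invalid="ignore"):
    gridded = total / count  # NaN where a cell holds no data

print(gridded.shape)
```

The `gridded` array, together with a meshgrid of `lon_edges` and `lat_edges` projected to map coordinates, could then be handed to pcolormesh; cells without data stay NaN.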
