Inspired by the example "Plot precip with filled contours" from this website, I want to make a plot of yesterday's precipitation data, projected onto a map. The example from that website can, however, no longer be used, because the data format of the precipitation data has changed.
My approach is as follows:
download the netCDF4-file from the National Weather Service website
open the netCDF4-file and extract the relevant information
create a map with Basemap
project the precipitation data onto the map
I guess my problem is that I do not understand the netCDF4-file format, and in particular the metadata, since the information about the grid origin of the precipitation data must be hidden somewhere in it.
My code looks as follows:
from datetime import datetime, timedelta
import netCDF4
import numpy as np
import matplotlib.pyplot as plt
import os.path
import urllib
from mpl_toolkits.basemap import Basemap
# set date for precipitation (1 day ago)
precip_date = datetime.utcnow() - timedelta(days=1)
precip_fname = 'nws_precip_1day_{0:%Y%m%d}_conus.nc'.format( precip_date )
precip_url = 'http://water.weather.gov/precip/downloads/{0:%Y/%m/%d}/{1}'.format( precip_date, precip_fname )
# download netCDF4-file if it does not exist already
if not os.path.isfile( precip_fname ):
    urllib.urlretrieve( precip_url, precip_fname )
# read netCDF4 dataset and extract relevant data
precip_dSet = netCDF4.Dataset( precip_fname )
# spatial coordinates
precip_x = precip_dSet['x'][:]
precip_y = precip_dSet['y'][:]
# precipitation data (is masked array in netCDF4-dataset)
precip_data = np.ma.getdata( precip_dSet['observation'][:] )
# grid information
precip_lat0 = precip_dSet[ precip_dSet['observation'].grid_mapping ].latitude_of_projection_origin
precip_lon0 = precip_dSet[ precip_dSet['observation'].grid_mapping ].straight_vertical_longitude_from_pole
precip_latts = precip_dSet[ precip_dSet['observation'].grid_mapping ].standard_parallel
# close netCDF4 dataset
precip_dSet.close()
fig1, ax1 = plt.subplots(1,1, figsize=(9,6) )
# create the map
my_map = Basemap( projection='stere', resolution='l',
                  width=(precip_x.max()-precip_x.min()),
                  height=(precip_y.max()-precip_y.min()),
                  lat_0=30,             # what is the correct value here?
                  lon_0=precip_lon0,
                  lat_ts=precip_latts
                  )
# white background
my_map.drawmapboundary( fill_color='white' )
# grey coastlines, country borders, state borders
my_map.drawcoastlines( color='0.1' )
my_map.drawcountries( color='0.5' )
my_map.drawstates( color='0.8' )
# contour plot of precipitation data
# create the grid for the precipitation data
precip_lons, precip_lats = my_map.makegrid( precip_x.shape[0], precip_y.shape[0] )
precip_xx, precip_yy = my_map( precip_lons, precip_lats )
# make the contour plot
cont_precip = my_map.contourf( precip_xx, precip_yy, precip_data )
plt.show()
This is what the output looks like (yes, for the final plot the color levels have to be adjusted):
I know that this is a very specific question, so any suggestions/hints are greatly appreciated.
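For reference, all the metadata attached to the grid-mapping variable (where the projection origin should be recorded) can be listed directly from the dataset. A minimal sketch using the netCDF4 API, reusing precip_fname from the code above:
ds = netCDF4.Dataset( precip_fname )
gm_name = ds['observation'].grid_mapping     # name of the grid-mapping variable
gm_var = ds[ gm_name ]
for attr in gm_var.ncattrs():                # ncattrs() lists the attribute names
    print( '{0} = {1}'.format( attr, getattr( gm_var, attr ) ) )
ds.close()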
If I understand correctly you are able to make the plot but want hints on adding extras?
xarray is a fantastic toolbox for working with netCDF files. It works like pandas, but for netCDF files, and is a big improvement on netCDF4:
http://xarray.pydata.org/en/stable/
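For example, opening the same file with xarray gives labelled access to the variables and their metadata (a minimal sketch; the file name is hypothetical and just follows the naming pattern from the question):
import xarray as xr

# open the NWS precipitation file (hypothetical name, same pattern as in the question)
ds = xr.open_dataset('nws_precip_1day_20170601_conus.nc')
print(ds)                    # dimensions, coordinates, variables and global attributes
precip = ds['observation']   # the precipitation variable as a labelled DataArray
print(precip.attrs)          # per-variable metadata, e.g. grid_mapping and units
ds.close()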
To draw specific contours, you can pass in the levels:
cont_precip = my_map.contourf( precip_xx, precip_yy, precip_data, levels=[10,20,30] )  # edit for the exact contours needed
If you want, you can add a colorbar:
fig1.colorbar(cont_precip,ax=ax1)
Related
I tried to follow the tutorial by McKay Johns on YouTube (see the Jupyter Notebook for the data: https://github.com/mckayjohns/passmap/blob/main/Pass%20map%20tutorial.ipynb).
I understood everything, but I wanted to make one small change: replacing plt.plot(...) with:
plt.arrow(df['x'][x], df['y'][x], df['endX'][x] - df['x'][x], df['endY'][x] - df['y'][x],
          shape='full', color='green')
But the problem is that I still can't see the arrows. I have tried several changes without success, so I'd like to ask here.
Below you can see the code.
## Read in the data
df = pd.read_csv('...\Codes\Plotting_Passes\messibetis.csv')
#convert the data to match the mplsoccer statsbomb pitch
#to see how to create the pitch, watch the video here: https://www.youtube.com/watch?v=55k1mCRyd2k
df['x'] = df['x']*1.2
df['y'] = df['y']*.8
df['endX'] = df['endX']*1.2
df['endY'] = df['endY']*.8
# Set Base
fig ,ax = plt.subplots(figsize=(13.5,8))
# Change background color of base
fig.set_facecolor('#22312b')
# Change color of base inside
ax.patch.set_facecolor('#22312b')
#this is how we create the pitch
pitch = Pitch(pitch_type='statsbomb',
              pitch_color='#22312b', line_color='#c7d5cc')
# Set the axes to our Base
pitch.draw(ax=ax)
# X-axis => 0 to 120
# Y-axis => 80 to 0
# Solution: invert the Y-axis:
plt.gca().invert_yaxis()
#use a for loop to plot each pass
for x in range(len(df['x'])):
    if df['outcome'][x] == 'Successful':
        #plt.plot((df['x'][x],df['endX'][x]),(df['y'][x],df['endY'][x]),color='green')
        plt.scatter(df['x'][x],df['y'][x],color='green')
        plt.arrow(df['x'][x],df['y'][x], df['endX'][x] - df['x'][x], df['endY'][x]-df['y'][x],
                  shape='full', color='green')  # Here is the problem!
    if df['outcome'][x] == 'Unsuccessful':
        plt.plot((df['x'][x],df['endX'][x]),(df['y'][x],df['endY'][x]),color='red')
        plt.scatter(df['x'][x],df['y'][x],color='red')
plt.title('Messi Pass Map vs Real Betis',color='white',size=20)
It always shows:
The problem is that plt.arrow has default values for head_width and head_length, which are too small for your figure. That is, it is drawing arrows; the arrow heads are just far too tiny to see (even if you zoom in). E.g. try something as follows:
import pandas as pd
import matplotlib.pyplot as plt
from mplsoccer.pitch import Pitch
df = pd.read_csv('https://raw.githubusercontent.com/mckayjohns/passmap/main/messibetis.csv')
...
# create a dict for the colors to avoid repetitive code
colors = {'Successful':'green', 'Unsuccessful':'red'}
for x in range(len(df['x'])):
    plt.scatter(df['x'][x], df['y'][x], color=colors[df.outcome[x]], marker=".")
    plt.arrow(df['x'][x], df['y'][x], df['endX'][x] - df['x'][x],
              df['endY'][x] - df['y'][x], color=colors[df.outcome[x]],
              head_width=1, head_length=1, length_includes_head=True)
    # setting `length_includes_head` to `True` ensures that the arrow head is
    # *part* of the line, not added on top
plt.title('Messi Pass Map vs Real Betis',color='white',size=20)
Result:
Note that you can also use plt.annotate for this, passing specific props to the parameter arrowprops. E.g.:
import pandas as pd
import matplotlib.pyplot as plt
from mplsoccer.pitch import Pitch
df = pd.read_csv('https://raw.githubusercontent.com/mckayjohns/passmap/main/messibetis.csv')
...
# create a dict for the colors to avoid repetitive code
colors = {'Successful':'green', 'Unsuccessful':'red'}
for x in range(len(df['x'])):
    plt.scatter(df['x'][x], df['y'][x], color=colors[df.outcome[x]], marker=".")
    props = {'arrowstyle': '-|>,head_width=0.25,head_length=0.5',
             'color': colors[df.outcome[x]]}
    plt.annotate("", xy=(df['endX'][x], df['endY'][x]),
                 xytext=(df['x'][x], df['y'][x]), arrowprops=props)
plt.title('Messi Pass Map vs Real Betis',color='white',size=20)
Result (a bit sharper, if you ask me, but maybe some tweaking of the parameters in plt.arrow can achieve that too):
Two sections of my code are giving me trouble. I am trying to create the basemap in this first section here:
#Basemap
epsg = 6060; width = 2000.e3; height = 2000.e3 #epsg 3413. 6062
m=Basemap(epsg=epsg,resolution='l',width=width,height=height) #lat_ts=(90.+35.)/2.
m.drawcoastlines(color='white')
m.drawmapboundary(fill_color='#99ffff')
m.fillcontinents(color='#cc9966',lake_color='#99ffff')
m.drawparallels(np.arange(10,70,20),labels=[1,1,0,0])
m.drawmeridians(np.arange(-100,0,20),labels=[0,0,0,1])
plt.title('ICESAT2 Tracks in Greenland')
plt.figure(figsize=(20,10))
Then my next section is meant to plot the data it is getting from a file and draw those tracks on top of the basemap. Instead, it creates a new plot entirely. I have tried rewording the second plt.scatter call to match Basemap, such as m.scatter, m.plt, etc., but it only returns "RuntimeError: Can not put single artist in more than one figure" when I do so.
Any ideas on how to get this next section of code onto the basemap? Here is the next section, focus on the end to see where it is plotting.
icesat2_data[track] = dict() # creates a sub-dictionary, track
icesat2_data[track][year+month+day] = dict() # and one layer more for the date under the whole icesat2_data dictionary
icesat2_data[track][year+month+day] = dict.fromkeys(lasers)
for laser in lasers: # for loop, access all the gt1l, 2l, 3l
    if laser in f:
        lat = f[laser]["land_ice_segments"]["latitude"][:] # data for a particular laser's latitude
        lon = f[laser]["land_ice_segments"]["longitude"][:] # data for a laser's longitude
        height = f[laser]["land_ice_segments"]["h_li"][:] # data for a laser's height
        quality = f[laser]["land_ice_segments"]["atl06_quality_summary"][:].astype('int')
        # Quality filter
        idx1 = quality == 0 # quality summary of 0 marks good data
        #print('idx1', idx1)
        # Spatial filter
        idx2 = np.logical_and( np.logical_and(lat>=lat_min, lat<=lat_max), np.logical_and(lon>=lon_min, lon<=lon_max) )
        # combine the quality and spatial filters; if the result is empty, all data failed a test (low quality or outside the box)
        idx = np.where( np.logical_and(idx1, idx2) )
        # store the data: create an empty dictionary with the keys 'lat', 'lon', 'height'
        icesat2_data[track][year+month+day][laser] = dict.fromkeys(['lat','lon','height'])
        # grab only the points with good data quality that lie inside the bounding box
        icesat2_data[track][year+month+day][laser]['lat'] = lat[idx]
        icesat2_data[track][year+month+day][laser]['lon'] = lon[idx]
        icesat2_data[track][year+month+day][laser]['height'] = height[idx]
        if lat[idx].any() and lon[idx].any():
            x, y = transformer.transform(icesat2_data[track][year+month+day][laser]['lon'],
                                         icesat2_data[track][year+month+day][laser]['lat'])
            plt.scatter(x, y, marker='o', color='#000000')
Currently, they output separately, like this:
Not sure if you're still working on this, but here's a quick example I put together that you might be able to work with (obviously I don't have the data you're working with). A couple of things that might not be self-explanatory: I used m() to transform the coordinates to map coordinates, which is Basemap's built-in transformation method, so you don't have to use pyproj. Also, setting a zorder in the scatter call ensures that your points are plotted above the map layers and don't get hidden underneath.
#Basemap
epsg = 6060; width = 2000.e3; height = 2000.e3 #epsg 3413. 6062
plt.figure(figsize=(20,10))
m=Basemap(epsg=epsg,resolution='l',width=width,height=height) #lat_ts=(90.+35.)/2.
m.drawcoastlines(color='white')
m.drawmapboundary(fill_color='#99ffff')
m.fillcontinents(color='#cc9966',lake_color='#99ffff')
m.drawparallels(np.arange(10,70,20),labels=[1,1,0,0])
m.drawmeridians(np.arange(-100,0,20),labels=[0,0,0,1])
plt.title('ICESAT2 Tracks in Greenland')
for coord in [[68,-39],[70,-39]]:
    lat = coord[0]
    lon = coord[1]
    x, y = m(lon,lat)
    m.scatter(x,y,color='red',s=100,zorder=10)
plt.show()
I think you might need:
plt.figure(figsize=(20,10))
before creating the basemap, not after. As it stands, the code creates a map and then creates a new figure after that, which is why you're getting two figures.
Then your plotting line should be m.scatter() as you mentioned you tried before.
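Putting both points together, the structure would look roughly like this (a sketch reusing the parameters from the question; lon and lat stand for whatever coordinates you computed from your track data):
plt.figure(figsize=(20,10))                  # create the figure first
m = Basemap(epsg=6060, resolution='l', width=2000.e3, height=2000.e3)
m.drawcoastlines(color='white')
m.drawmapboundary(fill_color='#99ffff')
m.fillcontinents(color='#cc9966', lake_color='#99ffff')
plt.title('ICESAT2 Tracks in Greenland')
# ... then plot the tracks on the same map:
x, y = m(lon, lat)                           # lon/lat from the filtered ICESat-2 data
m.scatter(x, y, marker='o', color='#000000', zorder=10)
plt.show()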
My name is Luis Francisco Gomez and I am taking the course Intermediate Python > 1 Matplotlib > Sizes, which belongs to the Data Scientist with Python track on DataCamp. I am reproducing the exercises of the course; in this part you have to make a scatter plot in which the size of the points is proportional to the population of the countries. I tried to reproduce the DataCamp results with this code:
# load subpackage
import matplotlib.pyplot as plt
## load other libraries
import pandas as pd
import numpy as np
## import data
gapminder = pd.read_csv("https://assets.datacamp.com/production/repositories/287/datasets/5b1e4356f9fa5b5ce32e9bd2b75c777284819cca/gapminder.csv")
gdp_cap = gapminder["gdp_cap"].tolist()
life_exp = gapminder["life_exp"].tolist()
# create an np array that contains the population
pop = gapminder["population"].tolist()
pop_np = np.array(pop)
plt.scatter(gdp_cap, life_exp, s = pop_np*2)
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000, 10000, 100000],['1k', '10k', '100k'])
# Display the plot
plt.show()
However, I get this:
But in theory I should get this:
I don't understand what the problem is with the argument s in plt.scatter.
You need to scale your s,
plt.scatter(gdp_cap, life_exp, s = pop_np*2/1000000)
Per the docs, s is the marker size in points**2.
This is because your sizes are too large, scale it down. Also, there's no need to create all the intermediate arrays:
plt.scatter(gapminder.gdp_cap,
            gapminder.life_exp,
            s=gapminder.population/1e6)
Output:
I think you should use
plt.scatter(gdp_cap, life_exp, s = gdp_cap*2)
or maybe reduce or scale pop_np
I am trying to convert Lambert conformal coordinates to lat/lon (WGS84) using wgrib2, but the result is shifted.
Command:
wgrib2 "mypath" -match "10m...." -new_grid_winds grid -new_grid_interpolation neighbor -new_grid latlon 108:129:0.25 16:65:0.25 "outputpath"
which results in:
while it should look like this (from windy.com):
grib file:
Grib2 file
Grib2json file
I think there might be some flaws in the initial grib file. I converted the grib file to netCDF using wgrib2, then made some plots with Python, and the result is not good.
The thing is, when I plot the temperature and overlay it with wind vectors, it looks OK. The problem is that when I also add the coastline, the location of Taiwan and of the mainland does not match the coastline drawn from Basemap's database.
Therefore I assume there is something wrong in the initial grib file: the coordinates (start and end point, or the step) are not quite right, and so the coordinates written to the netCDF file are not correct either.
My code is here, if interested:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from netCDF4 import Dataset
import json
# -------------------------------
# read the json file:
with open('2018091312.json','r') as f:
    data = json.load(f)
# -------------------------------
lo1,lo2,la1,la2 = 108,142.75,16,23.75
dx,dy=0.25,0.25
nx,ny=140,32
udata=np.array(data[0]['data'],dtype='float32');udata=np.reshape(udata,(ny,nx));
vdata=np.array(data[1]['data'],dtype='float32');vdata=np.reshape(vdata,(ny,nx));
londata=np.arange(lo1,lo2+dx,dx);
latdata=np.arange(la1,la2+dy,dy);
londata,latdata=np.meshgrid(londata,latdata)
# -------------------------------
# -------------------------------
ncin=Dataset('test.nc');
lons=ncin.variables['longitude'][:];
lats=ncin.variables['latitude'][:];
u10=np.squeeze(ncin.variables['UGRD_10maboveground'][:])
v10=np.squeeze(ncin.variables['VGRD_10maboveground'][:])
t2=np.squeeze(ncin.variables['TMP_surface'][:])
ncin.close();
# -------------------------------
xlim=(np.min(lons),np.max(lons));
ylim=(np.min(lats),np.max(lats));
# -------------------------------
plt.figure(figsize=(8, 8))
m = Basemap(projection='cyl', resolution='i',
            llcrnrlat=ylim[0], urcrnrlat=ylim[1],
            llcrnrlon=xlim[0], urcrnrlon=xlim[1], )
xx,yy=m(lons,lats);
m.pcolormesh(lons,lats,t2,vmin=273.,vmax=300.);
skipx=skipy=16
m.quiver(xx[::skipy,::skipx],yy[::skipy,::skipx],u10[::skipy,::skipx],v10[::skipy,::skipx],scale=20.0,units='inches');
# ------------------------------------------
plt.savefig('test_withoutland.png',bbox_inches='tight')
m.drawcoastlines()
m.drawlsmask(land_color = "#ddaa66")
plt.savefig('test_withland.png',bbox_inches='tight')
plt.show()
# ------------------------------------------
skipx,skipy=2,2
plt.figure(figsize=(8, 8))
m = Basemap(projection='cyl', resolution='i',
            llcrnrlat=ylim[0], urcrnrlat=ylim[1],
            llcrnrlon=xlim[0], urcrnrlon=xlim[1], )
xx,yy=m(londata,latdata);
m.pcolormesh(lons,lats,t2,vmin=273.,vmax=300.);
m.quiver(xx[::skipy,::skipx],yy[::skipy,::skipx],udata[::skipy,::skipx],vdata[::skipy,::skipx],scale=20.0,units='inches');
# ------------------------------------------
m.drawcoastlines()
m.drawlsmask(land_color = "#ddaa66")
plt.savefig('test_json.png',bbox_inches='tight')
plt.show()
And the result looks like this (the test with JSON file):
I did the conversion from grib to netCDF like this:
wgrib2 M-A0064-000.grb2 -netcdf test.nc
There are some weird definitions in the WRF Lambert conformal (LCC) projection that you need to keep in mind when doing your reprojections. This website (unaffiliated) details most of them using Python.
https://fabienmaussion.info/2018/01/06/wrf-projection/
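For example, once the projection parameters are known, the reprojection itself can be done with pyproj. A minimal sketch with placeholder LCC parameters (lat_1, lat_2, lat_0 and lon_0 must come from your own file; note that WRF assumes a spherical earth of radius 6370 km, which is one of the definitions the linked post discusses):
import numpy as np
from pyproj import CRS, Transformer

# placeholder WRF-style Lambert conformal definition -- the parameter values
# here are illustrative only and must be taken from your own grib/WRF file
wrf_lcc = CRS.from_proj4(
    "+proj=lcc +lat_1=30 +lat_2=60 +lat_0=40 +lon_0=120 +a=6370000 +b=6370000"
)
wgs84 = CRS.from_epsg(4326)
to_lonlat = Transformer.from_crs(wrf_lcc, wgs84, always_xy=True)

# projected grid coordinates in metres (placeholder values)
x = np.array([0.0, 25000.0, 50000.0])
y = np.array([0.0, 25000.0, 50000.0])
lon, lat = to_lonlat.transform(x, y)
print(lon, lat)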
I am currently working with BUFR files with wind data. When I read this file on python I get 4 large vectors, latitude vector, longitude vector, wind_direction vector, and wind_speed vector.
Both wind vectors are masked Python arrays because some of the data is invalid. This happens because the data comes from a non-geostationary satellite. In fact, I successfully generated the following image from this BUFR file to show you the general shape the data takes.
In this image I have plotted a color field to represent the wind speed, while the arrows obviously represent the wind direction.
Please notice the two bands of actual data. Unfortunately, the way I am plotting the data generates a third band (where the color field is smooth) in between the actual data bands. This is an artefact of the function pcolormesh. If I could superimpose two pcolormesh plots, each one representing one of the bands, this problem would disappear.
Unfortunately, I do not know how I could separate the data "regions". I have thought about clustering techniques but do not know how to cluster along latlon data using ANOTHER array (the wind data) as the clustering rule.
This is my current code:
#!/usr/bin/python
import bufr
import numpy as np
import sys
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot as plt
from matplotlib import mlab
WIND_DIR_INDEX = 97
WIND_SPEED_INDEX = 96
bfrfile = sys.argv[1]
print bfrfile
bfr = bufr.BUFRFile(bfrfile)
lon = []
lat = []
wind_d = []
wind_s = []
for record in bfr:
    for entry in record:
        if entry.index == WIND_DIR_INDEX:
            wind_d.append(entry.data)
        if entry.index == WIND_SPEED_INDEX:
            wind_s.append(entry.data)
        if entry.name.find("LONGITUDE") == 0:
            lon.append(entry.data)
        if entry.name.find("LATITUDE") == 0:
            lat.append(entry.data)
lons = np.concatenate(lon)
lats = np.concatenate(lat)
winds_d = np.concatenate(wind_d)
winds_s = np.concatenate(wind_s)
winds_d = np.ma.masked_greater(winds_d,1.0e+6)
winds_s = np.ma.masked_greater(winds_s,1.0e+6)
windu = np.cos((winds_d-180)*(np.pi/180))
windv = np.sin((winds_d-180)*(np.pi/180))
# Data interpolation for pcolormesh (needs gridded data)
xi = np.linspace(lons.min(),lons.max(),lons.size/10)
yi = np.linspace(lats.min(),lats.max(),lats.size/10)
Z = mlab.griddata(lons,lats,winds_s,xi,yi)
X,Y = np.meshgrid(xi,yi)
mydpi = 96
fig = plt.figure(frameon=True)
fig.set_size_inches(1600/mydpi,1200/mydpi)
ax = plt.Axes(fig,[0,0,1,1])
#ax.set_axis_off()
fig.add_axes(ax)
plt.hold(True);
plt.quiver(lons[::5],lats[::5],windu[::5],windv[::5],linewidths=0)
for method in (ax.set_xticks,ax.set_xticklabels,ax.set_yticks,ax.set_yticklabels):
    method([])
fig.savefig('/home/cendas/bin/python/bufr_ascat.png',bbox_inches=0,dpi=5*mydpi)
mydpi = 96
fig = plt.figure(frameon=True)
fig.set_size_inches(1600/mydpi,1200/mydpi)
ax = plt.Axes(fig,[0,0,1,1])
#ax.set_axis_off()
fig.add_axes(ax)
plt.hold(True);
try:
    plt.pcolormesh(X,Y,Z,alpha=None)
    plt.clim(0,10)
except ValueError:
    pass
    print "Warning: Empty data array."
for method in (ax.set_xticks,ax.set_xticklabels,ax.set_yticks,ax.set_yticklabels):
    method([])
fig.savefig('/home/cendas/bin/python/bufr_ascat_color.png',bbox_inches=0,dpi=5*mydpi)
I then usually follow this python code with the following terminal commands to combine the images:
convert bufr_ascat.png -transparent white bufr_ascat.png
convert bufr_ascat_color.png -transparent white bufr_ascat_color.png
composite bufr_ascat.png bufr_ascat_color.png bufrascat.png
Don't abuse clustering for this.
What you need is simple selection / filtering, not a structure-discovery process.
Take the mean of the data's coordinates: all non-masked data to the left of that mean belongs to one band, and all non-masked data to the right belongs to the other.
Clustering is the wrong tool for this task.
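A minimal sketch of that idea, reusing the variable names from the question and assuming the two bands are separated along the longitude axis and that winds_s carries a mask:
# split the valid (non-masked) points into two bands by comparing each
# longitude with the mean longitude of the valid points
valid = ~np.ma.getmaskarray(winds_s)
split_lon = lons[valid].mean()
bands = [valid & (lons < split_lon), valid & (lons >= split_lon)]

# interpolate and draw each band separately, so the gridding never bridges the gap
for band in bands:
    xi = np.linspace(lons[band].min(), lons[band].max(), lons[band].size // 10)
    yi = np.linspace(lats[band].min(), lats[band].max(), lats[band].size // 10)
    Zi = mlab.griddata(lons[band], lats[band], winds_s[band], xi, yi)
    Xi, Yi = np.meshgrid(xi, yi)
    plt.pcolormesh(Xi, Yi, Zi, vmin=0, vmax=10)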