Python: Overlaying shapefile with np.array data - python

I have a shapefile of the United States, and I have an m x n array of Cartesian data that represents temperature at each pixel. I am able to load in the shapefile and plot it:
import shapefile as shp
import matplotlib.pyplot as plt
sf = shp.Reader("/path/to/USA.shp")
plt.figure()
for shape in sf.shapeRecords():
    for i in range(len(shape.shape.parts)):
        i_start = shape.shape.parts[i]
        if i==len(shape.shape.parts)-1:
            i_end = len(shape.shape.points)
        else:
            i_end = shape.shape.parts[i+1]
        x = [i[0] for i in shape.shape.points[i_start:i_end]]
        y = [i[1] for i in shape.shape.points[i_start:i_end]]
        plt.plot(x,y, color = 'black')
plt.show()
And I am able to read in my data and plot it:
import pickle
from matplotlib import pyplot as mp
Tfile = '/path/to/file.pkl'
with open(Tfile, 'rb') as f:  # pickle files need binary mode on Python 3
    reshapeT = pickle.load(f)
mp.matshow(reshapeT)
The problem is that reshapeT has dimensions of 536 x 592 and is a subdomain of the US. However, I do have the lat/long of the top-left corner of the reshapeT grid, as well as the spacing between pixels (0.01).
My question is: How do I overlay the reshapeT data on top of the shapefile domain?

If I understand you correctly, you would like to overlay a 536x592 numpy array on a specific part of a plotted shapefile. I would suggest using Matplotlib's imshow() method with the extent parameter, which lets you place the image within the plot.
Your way of plotting the shapefile is fine, however, if you have the possibility to use geopandas, it will dramatically simplify things. Plotting the shapefile will reduce to the following lines:
import geopandas as gpd
sf = gpd.read_file("/path/to/USA.shp")
ax1 = sf.plot(edgecolor='black', facecolor='none')
As you have done previously, let's load the array data now:
import pickle
Tfile = '/path/to/file.pkl'
with open(Tfile, 'rb') as f:  # pickle files need binary mode on Python 3
    reshapeT = pickle.load(f)
Now, in order to plot the numpy array in the correct position, we first need to calculate its extent (the area it will cover, expressed in coordinates). You mentioned that you have the top-left corner and the resolution (0.01), and that's all we need. In the following I'm assuming that the lat/lon of the top-left corner is saved in the top_left_lat and top_left_lon variables. The extent needs to be passed as a tuple with a value for each of the edges (in the order left, right, bottom, top).
Hence, our extent can be calculated as follows:
extent_mat = (top_left_lon, top_left_lon + reshapeT.shape[1] * 0.01, top_left_lat - reshapeT.shape[0] * 0.01, top_left_lat)
Finally, we plot the matrix onto the same axes object, ax1, on which we already plotted the shape file to the calculated extent:
# Let's turn off autoscale first. This prevents
# the view of the plot from being limited to the image
# dimensions (instead of the entire shapefile). If you prefer
# that behaviour, just remove the following line
ax1.autoscale(False)
# Finally, let's plot!
ax1.imshow(reshapeT, extent=extent_mat)
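As a sanity check on the extent arithmetic, here is a minimal sketch. The helper names grid_extent and lonlat_to_rowcol and the corner coordinates (-100.0, 45.0) are made up for illustration; only the 536 x 592 shape and the 0.01 spacing come from the question. It also assumes that row 0 of the array is the northernmost row, which is what imshow's default origin='upper' expects.

```python
def grid_extent(top_left_lon, top_left_lat, n_rows, n_cols, spacing=0.01):
    """Return (left, right, bottom, top), the order imshow(extent=...) expects."""
    left = top_left_lon
    right = top_left_lon + n_cols * spacing   # columns run west -> east
    top = top_left_lat
    bottom = top_left_lat - n_rows * spacing  # rows run north -> south
    return (left, right, bottom, top)


def lonlat_to_rowcol(lon, lat, top_left_lon, top_left_lat, spacing=0.01):
    """Map a lon/lat position to the (row, col) of the pixel that contains it."""
    col = int((lon - top_left_lon) // spacing)
    row = int((top_left_lat - lat) // spacing)
    return row, col


# Hypothetical corner; replace with the real top-left lon/lat of the grid.
extent_mat = grid_extent(-100.0, 45.0, n_rows=536, n_cols=592)
print(extent_mat)
```

The resulting tuple can be passed straight to ax1.imshow(reshapeT, extent=extent_mat). The second helper is only a convenience for checking which pixel a known location should fall in.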

Related

Display geographical points using geopandas

I want to display points on the map using a shape file as a map and a csv with coordinates. The code works but I don't understand how to show the figure map.
My questions are: how do I display the points? What is "WnvPresent"? How can I just display the map and the points, not split into negative and positive but as a whole?
Website from where i downloaded the shp file: https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/countries
Website from where the idea comes from: https://towardsdatascience.com/geopandas-101-plot-any-data-with-a-latitude-and-longitude-on-a-map-98e01944b972
import pandas as pd
import matplotlib.pyplot as plt
import descartes
import geopandas as gpd
from shapely.geometry import Point, Polygon
%matplotlib inline
#read map data in form of .shp
street_map = gpd.read_file(r"C:\Users\stetc\Desktop\images/portofolio\ref-countries-2016-01m.shp")
#create the map
fig,ax = plt.subplots(figsize=(15,15))
street_map.plot(ax = ax)
#read given data
df = pd.read_csv(r"C:\Users\stetc\Documents\full_dataset.csv")
#the next step is to get the data in the right format. The way we do this is by turning our regular Pandas DataFrame into a geo-DataFrame, which will require us to specify as parameters the original DataFrame, our coordinate reference system (CRS), and the geometry of our new DataFrame. In order to format our geometry appropriately, we will need to convert the longitude and latitude into Points (we imported Point from shapely above), so first let’s read in the training data-set and specify the EPSG:4326 CRS like so
crs = {"init":"epsg:4326"}
#create points using longitude and lat from the data set
geometry = [Point(xy) for xy in zip (df["Longitude"], df["Latitude"])]
#Create a GeoDataFrame
geo_df = gpd.GeoDataFrame(df, #specify our data
                          crs=crs, # specify the coordinates reference system
                          geometry=geometry #specify the geometry list created
                          )
fig,ax = plt.subplots(figsize = (15,15))
street_map.plot (ax = ax, alpha = 0.4 , color="grey" )
geo_df[geo_df["WnvPresent"]==0].plot(ax=ax,markersize=20, color = "blue", marker="o",label="Neg")
geo_df[geo_df["WnvPresent"]==1].plot(ax=ax,markersize=20, color = "red", marker="o",label="Pos")
plt.legend(prop={"size":15})
WnvPresent is just a column used in the example to plot two different colours (I would do it differently, but that is for another discussion), you can ignore that if your goal is to plot points only.
Try the code below. I have also added zorder to ensure that points are on top of the street_map.
fig, ax = plt.subplots(figsize=(15,15))
street_map.plot(ax=ax, alpha=0.4, color="grey", zorder=1)
geo_df.plot(ax=ax, markersize=20, color="blue", marker="o", zorder=2)
In the first step you create the figure, then you add street_map to ax and then geo_df to the same ax. The last line answers your question "how to display the points?". Keep in mind that both layers have to be in the same CRS (assuming EPSG:4326 from your code), otherwise the layers won't overlap.
A bit more on plotting is in geopandas docs - https://geopandas.readthedocs.io/en/latest/mapping.html and on CRS here https://geopandas.readthedocs.io/en/latest/projections.html.

python- plot heatmap at given coordinates using basemap

I am trying to make a map in basemap using pcolormesh (and I'm open to other methods). I have a csv file with coordinates, and one with the corresponding density value. (Should they be in one file?) I am trying to load the values as a numpy array and then plot the map, but I am unsure as how to correspond the density to the point. My map is currently just displaying blue everywhere, so I think it is just counting each coordinate and displaying it. NOTE: In the example code below, I created a CSV that has the coordinates and the density, for ease of testing.
Ideally, I have a range of values in density and the lower will be blue and the higher will be red. I am just very confused by how to put the density in there.
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np
m = Basemap(projection='npstere',boundinglat=60,lon_0=270,resolution='l')
m.drawcoastlines()
m.drawcounties()
array = np.genfromtxt("fake.csv", delimiter=",", skip_header=1)
lats = array[0,:]
lons = array[1,:]
nx = 360
ny = 180
lon_bins = np.linspace(-180, 180, nx)
lat_bins = np.linspace(70, 90, ny)
density, lat_edges, lon_edges = np.histogram2d(lats, lons, [lat_bins, lon_bins])
lon_bins_2d, lat_bins_2d = np.meshgrid(lon_bins, lat_bins)
xs, ys = m(lon_bins_2d, lat_bins_2d)
density = np.hstack((density, np.zeros((density.shape[0], 1))))
density = np.vstack((density, np.zeros((density.shape[1]))))
plt.pcolormesh(xs, ys, density, cmap="jet", shading='gouraud')
plt.show()
Admittedly, some of this code is patched together from googling for help, and this currently produces my desired map but, instead of displaying density, it's just a big blue blob. How do I get the density value to correspond to each coordinate? Thanks!
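One possible direction, sketched under assumptions rather than as a tested fix for the code above: np.histogram2d counts points per bin unless a weights array is passed, so the density column has to be supplied explicitly. The sketch also assumes the CSV stores one point per row with the columns lat, lon, density, in which case the columns would be selected with array[:, k] rather than array[k, :]. The sample values and bin counts are invented.

```python
import numpy as np

# Invented sample rows standing in for the CSV: lat, lon, density.
array = np.array([
    [72.3, -150.5, 5.0],
    [72.6, -151.5, 3.0],
    [85.5,   12.0, 9.0],
])
lats, lons, dens = array[:, 0], array[:, 1], array[:, 2]

lat_bins = np.linspace(70, 90, 21)     # 20 latitude bins, 1 degree each
lon_bins = np.linspace(-180, 180, 37)  # 36 longitude bins, 10 degrees each

# Sum of the density values that fall in each (lat, lon) bin.
total, _, _ = np.histogram2d(lats, lons, bins=[lat_bins, lon_bins], weights=dens)
# Number of points per bin, used to turn the sums into per-bin means.
count, _, _ = np.histogram2d(lats, lons, bins=[lat_bins, lon_bins])

# Bins without any point stay NaN instead of being reported as zero density.
mean_density = np.full(total.shape, np.nan)
np.divide(total, count, out=mean_density, where=count > 0)
```

The result has one value per bin, so it is one element smaller than the bin-edge arrays in each dimension. With flat shading, pcolormesh accepts exactly that combination, which would make the zero padding unnecessary.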

Heat map generation using coordinate points

I am new to Python. The answer to my question might already be available on StackOverflow, but honestly, I have tried almost all the code and suggestions I could find there.
My problem: Almost the same as it is described here. I have coordinate points (x and y) and the corresponding value (p) as a .csv file. I am reading that file using pandas.
df = pd.read_csv("example.csv")
The example.csv file can be downloaded from here. Assume an image of size 2000 x 2000.
Task:
Based on the x and y coordinates in the spreadsheet, I have to locate each point in that image.
Let A be an image and A(x,y) any point within A. I have to generate a heat map such that the 50 x 50 pixel block starting at (x, y), i.e. the block with corners A(x,y), A(x+50, y), A(x, y+50) and A(x+50, y+50), contains the value p for that coordinate point.
I found this link which is very helpful and serves my issue, but the problem is some more modifications are necessary for my datasets.
The code which is available in the above link:
#!/usr/bin/python3
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from skimage import io
from skimage.color import rgb2gray
import matplotlib as mpl
# Read original image
img = io.imread('img.jpg')
# Get the dimensions of the original image
x_dim, y_dim, z_dim = np.shape(img)
# Create heatmap
heatmap = np.zeros((x_dim, y_dim), dtype=float)
# Read CSV with a Pandas DataFrame
df = pd.read_csv("data.csv")
# Set probabilities values to specific indexes in the heatmap
for index, row in df.iterrows():
    # np.int was removed from NumPy; the built-in int does the same job
    x = int(row["x"])
    y = int(row["y"])
    x1 = int(row["x1"])
    y1 = int(row["y1"])
    p = row["Probability value"]
    heatmap[x:x1,y:y1] = p
# Plot images
fig, axes = plt.subplots(1, 2, figsize=(8, 4))
ax = axes.ravel()
ax[0].imshow(img)
ax[0].set_title("Original")
fig.colorbar(ax[0].imshow(img), ax=ax[0])
ax[1].imshow(img, vmin=0, vmax=1)
ax[1].imshow(heatmap, alpha=.5, cmap='jet')
ax[1].set_title("Original + heatmap")
# Specific colorbar
norm = mpl.colors.Normalize(vmin=0,vmax=2)
N = 11
cmap = plt.get_cmap('jet',N)
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])
plt.colorbar(sm, ticks=np.linspace(0,1,N),
             boundaries=np.arange(0,1.1,0.1))
fig.tight_layout()
plt.show()
Issues which I am facing when using this code:
This code generates a heat map with square edges, but I am expecting smooth edges. I know a Gaussian distribution might solve this problem, but I am new to Python and don't know how to apply one to my dataset.
The regions that don't belong to any coordinate point also get a layer of color. As a result, in the overlaid image that layer covers the background of the original image. In short, I want the background of the heat map to be transparent, so that the overlay does not hide the regions that are not covered by the coordinate points.
Any leads will be highly appreciated.
Your code is fine. Changing a single line should solve both issues.
Before changes:
ax[1].imshow(heatmap, alpha=.5, cmap='jet')
After changes:
ax[1].imshow(heatmap, alpha=.5, cmap='coolwarm', interpolation='gaussian')
Although the change above will solve your issue, you can use the function below if you want additional transparency:
import copy
def transparent_cmap(cmap, N=255):
    "Copy colormap and set alpha values"
    mycmap = copy.copy(cmap)  # copy so the registered colormap is not modified
    mycmap._init()
    mycmap._lut[:,-1] = np.linspace(0, 0.8, N+4)
    return mycmap
mycmap = transparent_cmap(plt.cm.coolwarm)
In that case, your previous code line will change like below:
ax[1].imshow(heatmap, alpha=.5, cmap=mycmap, vmin=0, vmax=1)
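A different route to a transparent background, named here as an alternative rather than what the answer above does, is to mask the cells that no coordinate point touched. Matplotlib's imshow leaves masked cells undrawn by default, so the underlying image stays visible there. The sketch assumes that a value of exactly 0 means "no data"; the 50-pixel block size follows the question, and the array size and sample points are invented.

```python
import numpy as np

heatmap = np.zeros((200, 200), dtype=float)

# Invented (x, y, p) samples; each one fills a 50 x 50 block as in the question.
samples = [(10, 20, 0.9), (100, 120, 0.4)]
for x, y, p in samples:
    heatmap[x:x + 50, y:y + 50] = p

# Mask every cell that was never written to, so it is skipped when drawing.
masked = np.ma.masked_equal(heatmap, 0.0)
```

The masked array can then replace heatmap in the overlay call, e.g. ax[1].imshow(masked, alpha=.5, cmap='coolwarm', vmin=0, vmax=1).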
The question you linked uses plotly. If you don't want to use that and want to simply smooth the way your data looks, I suggest just using a gaussian filter using scipy.
At the top, import:
import seaborn as sns
from scipy.ndimage import gaussian_filter
Then use it like this:
df_smooth = gaussian_filter(df, sigma=1)
sns.heatmap(df_smooth, vmin=-40, vmax=150, cmap ="coolwarm" , cbar=True , cbar_kws={"ticks":[-40,150,-20,0,25,50,75,100,125]})
You can change the amount of smoothing, using e.g. sigma=3, or any other number that gives you the amount of smoothing you want.
Keep in mind that this will also "smooth out" any maximum peaks in your data, so the minimum and maximum of the smoothed data will no longer match the values you specified in your normalization. To still get good-looking heatmaps, I would suggest not using fixed values for vmin and vmax, but:
sns.heatmap(df_smooth, vmin=np.min(df_smooth), vmax=np.max(df_smooth), cmap ="coolwarm" , cbar=True , cbar_kws={"ticks":[-40,150,-20,0,25,50,75,100,125]})
If the Gaussian filter gives you the result you described, you can even apply Gaussian normalization to your data directly.
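To illustrate why fixed vmin and vmax stop matching after smoothing, here is a small NumPy-only sketch. It is a stand-in for scipy's gaussian_filter, not a reimplementation of it: the helper names gaussian_kernel and smooth are hypothetical, and the edges are zero-padded, whereas scipy reflects the data at the borders by default.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1-D Gaussian weights, truncated at about 4 sigma and normalised to sum to 1."""
    radius = int(4 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def smooth(data, sigma):
    """Blur a 2-D array by convolving along rows, then along columns."""
    k = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, data, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")


data = np.zeros((21, 21))
data[10, 10] = 100.0               # a single sharp peak
smoothed = smooth(data, sigma=1)

print(data.max(), smoothed.max())  # the peak is much lower after smoothing
```

The total is preserved while the peak is spread over its neighbours, which is why taking vmin and vmax from the smoothed array, as suggested above, keeps the colour scale meaningful.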

Cartopy map fill entire axis

I would like to plot data on a grid (which is in LCC projection) with Cartopy, so that the data fills the entire axis (and also axes, but that is not the issue here).
To make it clearer, here is what I do with Cartopy:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import numpy as np
import pyproj as p4
from mpl_toolkits.basemap import Basemap
lalo = #read latitudes and longitudes of my grid defined in a special LCC projection (below)
lat = np.reshape(lalo[:,1],(ny,nx))
lon = np.reshape(lalo[:,0],(ny,nx))
minlat = lat[0,0]
maxlat = lat[-1,-1]
minlon = lon[0,0]
maxlon = lon[-1,-1]
Z = np.ones((ny,nx)) #some data
#grid definition for cartopy:
myproj = ccrs.LambertConformal(central_longitude=13.3333, central_latitude=47.5,
                               false_easting=400000, false_northing=400000,
                               secant_latitudes=(46, 49))
fig = plt.figure()
ax = plt.axes(projection = myproj)
plt.contourf(lon, lat, Z)#, transform=myproj)
#no difference with transform option as lon,lat are already in myproj projection
The result is an image which does not fill the entire axis, but looks like this:
When using Basemap like this:
a=6377397.155
rf=299.1528128
b= a*(1 - 1/rf)
m = Basemap(projection='lcc', resolution='h', rsphere=(a,b),
            llcrnrlon=minlon,llcrnrlat=minlat,urcrnrlon=maxlon,urcrnrlat=maxlat,
            llcrnrx=400000, llcrnry=400000,
            lat_1=46, lat_2=49, lat_0=47.5, lon_0=13.3333, ax=ax)
x,y = m(lon,lat)
m.contourf(x,y,Z)
I get the following (desired) image:
And finally, when using proj4 to convert lon and lat using this definition p4.Proj('+proj=lcc +lat_1=46N +lat_2=49N +lat_0=47.5N +lon_0=13.3333 +ellps=bessel +x_0=400000 +y_0=400000') I again get the desired image:
Is there any possibility to achieve this in cartopy as well?
In other words, I would like to have a plot where the data shows up in a perfect rectangle, and the background map is distorted accordingly, i.e. something like the opposite of this example (cannot install iris package, otherwise I would have tried with this example)
I have tried a few things like:
building a custom class for my projection as done here, just to be sure that the parameters are all set correctly (as in my proj4 definition).
played around with aspect ratios, but they only affect the axes not the axis,
and a few more things.
Any help is greatly appreciated!
The important piece of information that is missing here is that your data is in lats and lons, not in the Cartesian transverse Mercator coordinate system. As a result you will need to use a Cartesian coordinate system which speaks lats and lons (spherical contouring has not been implemented at this point). Such a coordinate system exists in the form of the PlateCarree crs - so simply passing this as the transform of the contoured data should put your data in the right place.
plt.contourf(lon, lat, Z, transform=ccrs.PlateCarree())
This really highlights the fact that the default coordinate system of your data is the same as that of the map, which in most cases is not longitudes and latitudes. The only way to change the CRS of your data is by passing the transform keyword.
HTH

How to use set clipped path for Basemap polygon

I want to use imshow (for example) to display some data inside the boundaries of a country (for the purposes of this example I chose the USA). The simple example below illustrates what I want:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import RegularPolygon
data = np.arange(100).reshape(10, 10)
fig = plt.figure()
ax = fig.add_subplot(111)
im = ax.imshow(data)
poly = RegularPolygon([ 0.5, 0.5], 6, 0.4, fc='none',
                      ec='k', transform=ax.transAxes)
im.set_clip_path(poly)
ax.add_patch(poly)
ax.axis('off')
plt.show()
The result is:
Now I want to do this but instead of a simple polygon, I want to use the complex shape of the USA. I have created some example data contained in the array of "Z" as can be seen in the code below. It is this data that I want to display, using a colourmap but only within the boundaries of mainland USA.
So far I have tried the following. I get a shapefile from here, contained in "nationp010g.shp.tar.gz", and I use the Basemap module in Python to plot the USA. Note that this is the only method I have found that gives me the ability to get a polygon of the area I need. If there are alternative methods I would also be interested in them. I then create a polygon called "mainpoly", which is almost the polygon I want, coloured in blue:
Notice how only one body has been coloured, all other disjoint polygons remain white:
So the area coloured blue is almost what I want. Note that there are unwanted borderlines near Canada because the border actually goes through some lakes, but that is a minor problem. The real problem is: why doesn't my imshow data display inside the USA? Comparing my first and second example codes, I can't see why I don't get a clipped imshow in my second example the way I do in the first. Any help in understanding what I am missing would be appreciated.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap as Basemap
from matplotlib.patches import Polygon
# Lambert Conformal map of lower 48 states.
m = Basemap(llcrnrlon=-119,llcrnrlat=22,urcrnrlon=-64,urcrnrlat=49,
            projection='lcc',lat_1=33,lat_2=45,lon_0=-95)
shp_info = m.readshapefile('nationp010g/nationp010g', 'borders', drawbounds=True) # draw country boundaries.
for nshape,seg in enumerate(m.borders):
    if nshape == 1873: #This nshape denotes the large continental body of the USA, which we want
        mainseg = seg
mainpoly = Polygon(mainseg,facecolor='blue',edgecolor='k')
nx, ny = 10, 10
lons, lats = m.makegrid(nx, ny) # get lat/lons of ny by nx evenly space grid.
x, y = m(lons, lats) # compute map proj coordinates.
Z = np.zeros((nx,ny))
Z[:] = np.nan
for i in np.arange(len(x)):
    for j in np.arange(len(y)):
        Z[i,j] = x[0,i]
ax = plt.gca()
im = ax.imshow(Z, cmap = plt.get_cmap('coolwarm') )
im.set_clip_path(mainpoly)
ax.add_patch(mainpoly)
plt.show()
Update
I realise that the line
ax.add_patch(mainpoly)
does not even add the polygon shape to the plot. Am I not using it correctly? As far as I know, mainpoly was calculated correctly using the Polygon() method. I checked that the coordinate inputs are sensible:
plt.plot(mainseg[:,0], mainseg[:,1] ,'.')
which gives
I have also been thinking about this problem for a long time.
I found that the NCL language has a function to mask the data outside a border.
Here is an example:
http://i5.tietuku.com/bdb1a6c007b82645.png
The contourf plot is only shown within the border of China. Click here for the code.
I know Python has a package called PyNCL which supports all NCL code within Python.
But I really want to plot this kind of figure using basemap. If you have figured it out, please post it. I would like to learn from it as soon as possible.
Thanks!
Add 2016-01-16
In a way, I have figured it out.
This is my idea and code, inspired by this question that I asked today.
My method:
1. Convert the shapefile of the area of interest (like the U.S.) into a shapely Polygon.
2. Test whether each value point lies inside or outside the polygon.
3. If the value point is outside the study area, set it to np.nan.
Intro
* The polygon xxx is a city in China in ESRI shapefile format.
* The fiona and shapely packages are used here.
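import fiona
from shapely.geometry import Point, Polygon, shape
# generate the shapely.polygon
src = fiona.open("xxx.shp")  # renamed so it no longer shadows shapely's shape()
pol = next(iter(src))        # first feature of the shapefile
geom = shape(pol['geometry'])
poly_data = pol["geometry"]["coordinates"][0]
poly = Polygon(poly_data)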
It shows like:
http://i4.tietuku.com/2012307faec02634.png
### test the value point
### generate the grid network which represented by the grid midpoints.
lon_med = np.linspace((xi[0:2].mean()),(xi[-2:].mean()),len(x_grid))
lat_med = np.linspace((yi[0:2].mean()),(yi[-2:].mean()),len(y_grid))
value_test_mean = dsu.mean(axis = 0)
value_mask = np.zeros(len(lon_med)*len(lat_med)).reshape(len(lat_med),len(lon_med))
for i in range(len(lat_med)):
    for j in range(len(lon_med)):
        # keep the value only if the grid midpoint lies inside the polygon
        if poly.contains(Point(lon_med[j], lat_med[i])):
            value_mask[i,j] = value_test_mean[i,j]
        else:
            value_mask[i,j] = np.nan
# Mask the np.nan value
Z_mask = np.ma.masked_where(np.isnan(value_mask),value_mask)
# plot!
fig=plt.figure(figsize=(6,4))
ax=plt.subplot()
map = Basemap(llcrnrlon=x_map1,llcrnrlat=y_map1,urcrnrlon=x_map2,urcrnrlat=y_map2)
map.drawparallels(np.arange(y_map1+0.1035,y_map2,0.2),labels= [1,0,0,1],size=14,linewidth=0,color= '#FFFFFF')
lon_grid = np.linspace(x_map1,x_map2,len(x_grid))
lat_grid = np.linspace(y_map1,y_map2,len(y_grid))
xx,yy = np.meshgrid(lon_grid,lat_grid)
pcol =plt.pcolor(xx,yy,Z_mask,cmap = plt.cm.Spectral_r ,alpha =0.75,zorder =2)
result
http://i4.tietuku.com/c6620c5b6730a5f0.png
http://i4.tietuku.com/a22ad484fee627b9.png
original result
http://i4.tietuku.com/011584fbc36222c9.png
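The per-point loop in the method above can also be vectorised. The sketch below swaps in matplotlib.path.Path.contains_points for the shapely test, so it is a variation on the method rather than the code that produced the figures. The square boundary and the grid are invented; a real boundary would come from the shapefile coordinates (poly_data above).

```python
import numpy as np
from matplotlib.path import Path

# Invented square "study area" in lon/lat.
boundary = Path([(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (0.0, 0.0)])

lon_med = np.arange(-0.5, 6.0, 1.0)   # grid midpoints -0.5, 0.5, ..., 5.5
lat_med = np.arange(-0.5, 5.0, 1.0)   # grid midpoints -0.5, 0.5, ..., 4.5
values = np.ones((len(lat_med), len(lon_med)))  # placeholder data

# Test all midpoints against the boundary in a single call.
xx, yy = np.meshgrid(lon_med, lat_med)
points = np.column_stack([xx.ravel(), yy.ravel()])
inside = boundary.contains_points(points).reshape(xx.shape)

# Same result as the loop: NaN outside the study area, then masked for plotting.
value_mask = np.where(inside, values, np.nan)
Z_mask = np.ma.masked_invalid(value_mask)
```

Z_mask can be passed to plt.pcolor in the same way as before. For a shapefile feature with holes or several parts, the shapely-based test may still be the safer choice.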
