I have a bike-sharing data set with a latitude and longitude for each station. A sample of the data is shown below. I want to find groups of 3 stations that are close to each other in terms of coordinates, and sum up the count for each group (the 3 closest points).
I know how to calculate the distance between two points, but I don't know how to program the search for each subset of 3 closest coordinates.
The code for calculating the distance between 2 points:
from math import cos, asin, sqrt, pi
def distance(lat1, lon1, lat2, lon2):
    p = pi/180
    a = 0.5 - cos((lat2-lat1)*p)/2 + cos(lat1*p) * cos(lat2*p) * (1-cos((lon2-lon1)*p))/2
    return 12742 * asin(sqrt(a))
The data:
    start_station_name         start_station_latitude  start_station_longitude  count
0   Schous plass               59.920259               10.760629                2
1   Pilestredet                59.926224               10.729625                4
2   Kirkeveien                 59.933558               10.726426                8
3   Hans Nielsen Hauges plass  59.939244               10.774319                0
4   Fredensborg                59.920995               10.750358                8
5   Marienlyst                 59.932454               10.721769                9
6   Sofienbergparken nord      59.923229               10.766171                3
7   Stensparken                59.927140               10.730981                4
8   Vålerenga                  59.908576               10.786856                6
9   Schous plass trikkestopp   59.920728               10.759486                5
10  Griffenfeldts gate         59.933703               10.751930                4
11  Hallénparken               59.931530               10.762169                8
12  Alexander Kiellands Plass  59.928058               10.751397                3
13  Uranienborgparken          59.922485               10.720896                2
14  Sommerfrydhagen            59.911453               10.776072                1
15  Vestkanttorvet             59.924403               10.713069                8
16  Bislettgata                59.923834               10.734638                9
17  Biskop Gunnerus' gate      59.912334               10.752292                1
18  Botanisk Hage sør          59.915282               10.769620                1
19  Hydroparken                59.914145               10.715505                1
20  Bøkkerveien                59.927375               10.796015                1
What I want is:
closest                                  count_sum
Schous plass, Pilestredet, Kirkeveien    14
.
.
.
The Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-49-1a4d3a72c23d> in <module>
7 for idx_1, idx_2 in [(0, 1), (1, 2), (0, 2)]:
8 total_distance += distance(
----> 9 combination[idx_1]['start_station_latitude'],
10 combination[idx_1]['start_station_longitude'],
11 combination[idx_2]['start_station_latitude'],
TypeError: 'int' object is not subscriptable
You could try all possible combinations of three stations with itertools.combinations() and keep the triple with the shortest total pairwise distance.
from itertools import combinations
best = (float('inf'), None)
for combination in combinations(data, 3):
    total_distance = 0
    for idx_1, idx_2 in [(0, 1), (1, 2), (0, 2)]:
        total_distance += distance(
            combination[idx_1]['start_station_latitude'],
            combination[idx_1]['start_station_longitude'],
            combination[idx_2]['start_station_latitude'],
            combination[idx_2]['start_station_longitude'],
        )
    if total_distance < best[0]:
        best = (total_distance, combination)
print(f'Best combination is {best[1]}, total distance: {best[0]}')
Keep in mind that there's still room for optimization, for example caching the distance between two stations like
from functools import lru_cache
@lru_cache(maxsize=None)
def distance(lat1, lon1, lat2, lon2):
    p = pi/180
    ...
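The brute-force search described in this answer can be sketched end to end as follows. It assumes each station is a plain dict (with pandas, data.to_dict('records') yields such a list); iterating over a DataFrame directly yields its column labels rather than rows, which is one likely source of the TypeError in the question. The helper name closest_triple and the five-row sample are illustrative only, and the sketch returns just the single tightest triple, not a partition of all stations.

```python
from itertools import combinations
from math import asin, cos, pi, sqrt


def distance(lat1, lon1, lat2, lon2):
    """Haversine distance in km (same formula as in the question)."""
    p = pi / 180
    a = (0.5 - cos((lat2 - lat1) * p) / 2
         + cos(lat1 * p) * cos(lat2 * p) * (1 - cos((lon2 - lon1) * p)) / 2)
    return 12742 * asin(sqrt(a))


def closest_triple(stations):
    """Return (total_distance, triple, count_sum) for the tightest group of 3.

    `stations` is a list of dicts, one dict per row (hypothetical helper).
    """
    best = (float("inf"), None)
    for triple in combinations(stations, 3):
        # Sum the three pairwise distances inside this triple.
        total = sum(
            distance(a["start_station_latitude"], a["start_station_longitude"],
                     b["start_station_latitude"], b["start_station_longitude"])
            for a, b in combinations(triple, 2)
        )
        if total < best[0]:
            best = (total, triple)
    total, triple = best
    return total, triple, sum(s["count"] for s in triple)


# A few rows from the question's sample, written out as records.
keys = ("start_station_name", "start_station_latitude",
        "start_station_longitude", "count")
rows = [
    ("Schous plass", 59.920259, 10.760629, 2),
    ("Pilestredet", 59.926224, 10.729625, 4),
    ("Sofienbergparken nord", 59.923229, 10.766171, 3),
    ("Vålerenga", 59.908576, 10.786856, 6),
    ("Schous plass trikkestopp", 59.920728, 10.759486, 5),
]
stations = [dict(zip(keys, row)) for row in rows]

total, triple, count_sum = closest_triple(stations)
print([s["start_station_name"] for s in triple], count_sum)
```

The search visits every 3-element subset, so its cost grows with the cube of the number of stations; that is fine for a few hundred stations but not for many thousands.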
I need to generate only the positive half of a sine curve between two values. My variable, say monthly-averaged RH, has 12 data points in a year (i.e. a time series) and varies between 50 and 70 in a sinusoidal way. The first and the last data points are both 50.
Can anyone help me generate this curve/function so that I can get the values of all the intermediate data points? I am trying to use numpy/scipy for this.
Best,
Debayan
This is basic trig.
import math
for i in range(12):
    print( i, 50 + 20 * math.sin( math.pi * i / 12 ) )
Output:
0 50.0
1 55.17638090205041
2 60.0
3 64.14213562373095
4 67.32050807568876
5 69.31851652578136
6 70.0
7 69.31851652578136
8 67.32050807568878
9 64.14213562373095
10 60.0
11 55.17638090205042
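Note that dividing by 12 makes the 12th point (i = 11) come out at about 55.18 rather than 50. If the last point really has to land on 50, as the question states, one option is to spread the half period over n_points - 1 intervals instead. This is a small sketch under that assumption; the function name half_sine and its parameters are made up for illustration.

```python
import math


def half_sine(n_points=12, low=50.0, high=70.0):
    """Sample the positive half of a sine so both end points equal `low`."""
    step = math.pi / (n_points - 1)  # n_points samples span [0, pi] inclusive
    return [low + (high - low) * math.sin(step * i) for i in range(n_points)]


values = half_sine()
for month, value in enumerate(values):
    print(month, round(value, 2))
```

With an even number of points the peak of the sine falls between two samples, so the largest sampled value stays slightly below 70.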
Here's a sample dataset with observations from 4 different trips (there are 4 unique trip IDs):
trip_id time_interval speed
0 8a8449635c10cc4b8e7841e517f27e2652c57ea3 873.96 0.062410
1 8a8449635c10cc4b8e7841e517f27e2652c57ea3 11.46 0.000000
2 8a8449635c10cc4b8e7841e517f27e2652c57ea3 903.96 0.247515
3 8a8449635c10cc4b8e7841e517f27e2652c57ea3 882.48 0.121376
4 8a8449635c10cc4b8e7841e517f27e2652c57ea3 918.78 0.185405
5 8a8449635c10cc4b8e7841e517f27e2652c57ea3 885.96 0.122147
6 f7fd70a8c14e43d8be91ef180e297d7195bbe9b0 276.60 0.583178
7 84d14618dcb30c28520cb679e867593c1d29213e 903.48 0.193313
8 84d14618dcb30c28520cb679e867593c1d29213e 899.34 0.085377
9 84d14618dcb30c28520cb679e867593c1d29213e 893.46 0.092259
10 84d14618dcb30c28520cb679e867593c1d29213e 849.36 0.350341
11 3db35f9835db3fe550de194b55b3a90a6c1ecb97 10.86 0.000000
12 3db35f9835db3fe550de194b55b3a90a6c1ecb97 919.50 0.071119
I am trying to compute the acceleration of each unique trip from one point to another.
Example:
first acceleration value will be computed using rows 0 and 1 (0 initial value; 1 final value)
second acceleration value will be computed using rows 1 and 2 (1 initial value; 2 final value)
... and so on.
As I want to compute this for each individual trip based on trip_id, this is what I attempted:
def get_acceleration(dataset):
    ##### INITIALISATION VARIABLES #####
    # Empty string for the trip ID
    current_trip_id = ""
    # Copy of the dataframe
    dataset2 = dataset.copy()
    # Creating a new column for the acceleration between observations of the same trip
    # all rows have a default value of 0
    dataset2["acceleration"] = 0
    ##### LOOP #####
    for index,row in dataset.iterrows():
        # Checking if we are looking at the same trip
        # when looking at the same trip, the default values of zero are replaced
        # by the calculated trip characteristic
        if row["trip_id"] == current_trip_id:
            # Final speed and time
            final_speed = row["speed"]
            final_time = row["time_interval"]
            print(type(final_speed))
            # Computing the acceleration (delta_v/ delta_t)
            acceleration = (final_speed[1] - initial_speed[0])/(final_time[1] - initial_time[0])
            # Adding the output to the acceleration column
            dataset2.loc[index, "acceleration"] = acceleration
        ##### UPDATING THE LOOP #####
        current_trip_id = row["trip_id"]
        # Initial speed and time
        initial_speed = row["speed"]
        initial_time = row["time_interval"]
    return dataset2
However, I get the error:
<ipython-input-42-0255a952850b> in get_acceleration(dataset)
27
28 # Computing the acceleration (delta_v/ delta_t)
---> 29 acceleration = (final_speed[1] - initial_speed[0])/(final_time[1] - initial_time[0])
30
31 # Adding the output to the acceleration column
TypeError: 'float' object is not subscriptable
How could I fix this error and compute the acceleration?
UPDATE:
After applying the answer below, division by zero can be avoided by adding an if/else statement:
delta_speed = final_speed - initial_speed
delta_time = final_time - initial_time
# Computing the acceleration (delta_v/ delta_t)
if delta_time != 0:
    acceleration = (delta_speed)/(delta_time)
else:
    acceleration = 0
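It works if you drop the indexing: row["speed"] and row["time_interval"] are already scalar floats, so final_speed[1] raises the TypeError.
acceleration = (final_speed - initial_speed)/(final_time - initial_time)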
    trip_id                                   time_interval  speed     acceleration
0   8a8449635c10cc4b8e7841e517f27e2652c57ea3  873.96         0.062410   0.000000
1   8a8449635c10cc4b8e7841e517f27e2652c57ea3  11.46          0.000000   0.000072
2   8a8449635c10cc4b8e7841e517f27e2652c57ea3  903.96         0.247515   0.000277
3   8a8449635c10cc4b8e7841e517f27e2652c57ea3  882.48         0.121376   0.005872
4   8a8449635c10cc4b8e7841e517f27e2652c57ea3  918.78         0.185405   0.001764
5   8a8449635c10cc4b8e7841e517f27e2652c57ea3  885.96         0.122147   0.001927
6   f7fd70a8c14e43d8be91ef180e297d7195bbe9b0  276.60         0.583178   0.000000
7   84d14618dcb30c28520cb679e867593c1d29213e  903.48         0.193313   0.000000
8   84d14618dcb30c28520cb679e867593c1d29213e  899.34         0.085377   0.026071
9   84d14618dcb30c28520cb679e867593c1d29213e  893.46         0.092259   0.001170
10  84d14618dcb30c28520cb679e867593c1d29213e  849.36         0.350341  -0.005852
11  3db35f9835db3fe550de194b55b3a90a6c1ecb97  10.86          0.000000   0.000000
12  3db35f9835db3fe550de194b55b3a90a6c1ecb97  919.50         0.071119   0.000078
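The same per-trip calculation can also be written without an explicit loop. The following is a sketch, not the method used in the answer: it relies on pandas groupby(...).diff() to take row-to-row differences inside each trip_id, and the function name add_acceleration and the tiny sample frame are invented for illustration.

```python
import numpy as np
import pandas as pd


def add_acceleration(df):
    """Return a copy of df with acceleration between consecutive rows of a trip."""
    out = df.copy()
    # Row-to-row differences computed separately inside each trip_id group;
    # the first row of every trip has no predecessor and becomes NaN.
    deltas = out.groupby("trip_id", sort=False)[["speed", "time_interval"]].diff()
    # Mark a zero time difference as missing so it cannot divide by zero.
    delta_time = deltas["time_interval"].replace(0, np.nan)
    # Missing results (first row of a trip, zero time difference) become 0.
    out["acceleration"] = (deltas["speed"] / delta_time).fillna(0.0)
    return out


sample = pd.DataFrame({
    "trip_id": ["a", "a", "a", "b"],
    "time_interval": [873.96, 11.46, 11.46, 276.60],
    "speed": [0.062410, 0.0, 0.5, 0.583178],
})
result = add_acceleration(sample)
print(result)
```

One difference from the loop: groupby pairs each row with the previous row of the same trip even if another trip's rows sit in between, whereas the loop only looks at the immediately preceding row. For data sorted by trip, as in the sample, both behave the same.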
I would like to compute the distance between two coordinates. I know I can compute the haversine distance between two points. However, I was wondering if there is an easier way of doing it instead of creating a loop using the formula iterating over the entire columns (also getting errors in the loop).
Here's some data for the example
import random
import pandas as pd
# Random values for the duration from one point to another
random_values = random.sample(range(2,20), 8)
random_values
# Creating arrays for the coordinates
lat_coor = [11.923855, 11.923862, 11.923851, 11.923847, 11.923865, 11.923841, 11.923860, 11.923846]
lon_coor = [57.723843, 57.723831, 57.723839, 57.723831, 57.723827, 57.723831, 57.723835, 57.723827]
df = pd.DataFrame(
    {'duration': random_values,
     'latitude': lat_coor,
     'longitude': lon_coor
    })
df
duration latitude longitude
0 5 11.923855 57.723843
1 2 11.923862 57.723831
2 10 11.923851 57.723839
3 19 11.923847 57.723831
4 16 11.923865 57.723827
5 4 11.923841 57.723831
6 13 11.923860 57.723835
7 3 11.923846 57.723827
To compute the distance this is what I've attempted:
# Looping over each row to compute the Haversine distance between two points
# Earth's radius (in m)
R = 6373.0 * 1000
lat = df["latitude"]
lon = df["longitude"]
for i in lat:
    lat1 = lat[i]
    lat2 = lat[i+1]
    for j in lon:
        lon1 = lon[i]
        lon2 = lon[i+1]
        dlon = lon2 - lon1
        dlat = lat2 - lat1
        # Haversine formula
        a = math.sin(dlat / 2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2)**2
        c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
        distance = R * c
        print(distance) # in m
However, this is the error I get:
The two points to compute the distance should be taken from the same column.
first distance value:
11.923855 57.723843 (point1/observation1)
11.923862 57.723831 (point2/observation2)
second distance value:
11.923862 57.723831 (point1/observation2)
11.923851 57.723839 (point2/observation3)
third distance value:
11.923851 57.723839 (point1/observation3)
11.923847 57.723831 (point2/observation4)
... (and so on)
OK, first you can create a dataframe that combines each measurement with the previous one:
df2 = pd.concat([df.add_suffix('_pre').shift(), df], axis=1)
df2
This outputs:
duration_pre latitude_pre longitude_pre duration latitude longitude
0 NaN NaN NaN 5 11.923855 57.723843
1 5.0 11.923855 57.723843 2 11.923862 57.723831
2 2.0 11.923862 57.723831 10 11.923851 57.723839
…
Then create a haversine function and apply it to the rows:
def haversine(lat1, lon1, lat2, lon2):
    import math
    # Note: math.sin and math.cos interpret their arguments as radians
    R = 6373.0 * 1000  # Earth's radius (in m)
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2)**2
    return R *2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
df2.apply(lambda x: haversine(x['latitude_pre'], x['longitude_pre'], x['latitude'], x['longitude']), axis=1)
which computes for each row the distance with the previous row (first one is thus NaN).
0 NaN
1 75.754755
2 81.120210
3 48.123604
…
And, if you want to include a new column in the original dataframe in one line:
df['distance'] = pd.concat([df.add_suffix('_pre').shift(), df], axis=1).apply(lambda x: haversine(x['latitude_pre'], x['longitude_pre'], x['latitude'], x['longitude']), axis=1)
Output:
duration latitude longitude distance
0 5 11.923855 57.723843 NaN
1 2 11.923862 57.723831 75.754755
2 10 11.923851 57.723839 81.120210
3 19 11.923847 57.723831 48.123604
4 16 11.923865 57.723827 116.515304
5 4 11.923841 57.723831 154.307571
6 13 11.923860 57.723835 122.794838
7 3 11.923846 57.723827 98.115312
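One caveat: Python's math.sin and math.cos take radians, while the latitude and longitude columns hold degrees. The sketch below converts to radians first, so its numbers will differ from the output listed above. It uses only the standard library; the names haversine_m and consecutive_distances are illustrative, and the radius is the same 6373 km value used in the question.

```python
import math

EARTH_RADIUS_M = 6373.0 * 1000  # same radius as in the question, in metres


def haversine_m(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Great-circle distance in metres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * radius * math.atan2(math.sqrt(a), math.sqrt(1 - a))


def consecutive_distances(lats, lons):
    """Distance from each point to the previous one; None for the first point."""
    points = list(zip(lats, lons))
    return [None] + [haversine_m(*p, *q) for p, q in zip(points, points[1:])]


lat_coor = [11.923855, 11.923862, 11.923851, 11.923847]
lon_coor = [57.723843, 57.723831, 57.723839, 57.723831]
print(consecutive_distances(lat_coor, lon_coor))
```

The resulting list has one entry per row, so it can be assigned directly to a new DataFrame column, e.g. df['distance'] = consecutive_distances(df['latitude'], df['longitude']).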
I understood that you want to get the pairwise haversine distance between all points in your df. Here's how this could be done:
Be careful when using this approach with a lot of points as it generates a lot of columns quickly
Setup
import random
import pandas as pd
random_values = random.sample(range(2,20), 8)
random_values
# Creating arrays for the coordinates
lat_coor = [11.923855, 11.923862, 11.923851, 11.923847, 11.923865, 11.923841, 11.923860, 11.923846]
lon_coor = [57.723843, 57.723831, 57.723839, 57.723831, 57.723827, 57.723831, 57.723835, 57.723827]
df = pd.DataFrame(
    {'duration': random_values,
     'latitude': lat_coor,
     'longitude': lon_coor
    })
Get radians
import math
df['lat_rad'] = df.latitude.apply(math.radians)
df['long_rad'] = df.longitude.apply(math.radians)
Calculate pairwise distances
from sklearn.metrics.pairwise import haversine_distances
for idx_from, from_point in df.iterrows():
    for idx_to, to_point in df.iterrows():
        column_name = f"Distance_to_point_{idx_from}"
        haversine_matrix = haversine_distances([[from_point.lat_rad, from_point.long_rad], [to_point.lat_rad, to_point.long_rad]])
        point_distance = haversine_matrix[0][1] * 6371000/1000
        df.loc[idx_to, column_name] = point_distance
df
duration latitude longitude lat_rad long_rad Distance_to_point_0 Distance_to_point_1 Distance_to_point_2 Distance_to_point_3 Distance_to_point_4 Distance_to_point_5 Distance_to_point_6 Distance_to_point_7
0 3 11.923855 57.723843 0.20811052928038845 0.20811052928038845 0.0 0.0010889626934743966 0.0006222644021223135 0.001244528808978787 0.0015556609862946524 0.002177925427923575 0.000777830496776312 0.0014000949117650525
1 13 11.923862 57.723831 0.2081106514534361 0.2081106514534361 0.0010889626934743966 0.0 0.0017112270955967099 0.002333491502453183 0.0004666982928202561 0.00326688812139797 0.00031113219669808446 0.0024890576052394482
2 14 11.923851 57.723839 0.2081104594672184 0.2081104594672184 0.0006222644021223135 0.0017112270955967099 0.0 0.0006222644068564735 0.002177925388416966 0.0015556610258012616 0.0014000948988986254 0.0007778305096427389
3 4 11.923847 57.723831 0.20811038965404832 0.20811038965404832 0.001244528808978787 0.002333491502453183 0.0006222644068564735 0.0 0.0028001897952734385 0.0009333966189447881 0.002022359305755099 0.0001555661027862654
4 5 11.923865 57.723827 0.20811070381331365 0.20811070381331365 0.0015556609862946524 0.0004666982928202561 0.002177925388416966 0.0028001897952734385 0.0 0.003733586414218225 0.0007778304895183407 0.002955755898059704
5 7 11.923841 57.723831 0.20811028493429318 0.20811028493429318 0.002177925427923575 0.00326688812139797 0.0015556610258012616 0.0009333966189447881 0.003733586414218225 0.0 0.002955755924699886 0.0007778305161585227
6 9 11.92386 57.723835 0.20811061654685106 0.20811061654685106 0.000777830496776312 0.00031113219669808446 0.0014000948988986254 0.002022359305755099 0.0007778304895183407 0.002955755924699886 0.0 0.002177925408541364
7 8 11.923846 57.723827 0.20811037220075576 0.20811037220075576 0.0014000949117650525 0.0024890576052394482 0.0007778305096427389 0.0001555661027862654 0.002955755898059704 0.0007778305161585227 0.002177925408541364 0.0
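The nested iterrows loop above evaluates one pair of rows at a time. As an alternative sketch, the full distance matrix can be computed in one step with NumPy broadcasting. The function name pairwise_haversine_km is hypothetical, inputs are taken in degrees, and the 6371 km radius mirrors the 6371000/1000 factor used above.

```python
import numpy as np


def pairwise_haversine_km(lat_deg, lon_deg, radius_km=6371.0):
    """Return an (n, n) matrix of great-circle distances in km."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    # Broadcasting a column against a row yields every pairwise difference.
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against tiny floating-point overshoot outside [0, 1].
    return 2 * radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


lat_coor = [11.923855, 11.923862, 11.923851]
lon_coor = [57.723843, 57.723831, 57.723839]
dist = pairwise_haversine_km(lat_coor, lon_coor)
print(dist.shape)
```

Keeping the result as a separate square matrix, where dist[i, j] is the distance between rows i and j, also avoids adding one column per point to the DataFrame.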
You are confusing the index versus the values themselves, so you are getting a key error because there is no lat[i] (e.g., lat[11.923855]) in your example. After fixing i to be the index, your code would go beyond the last row of lat and lon with your [i+1]. Since you want to compare each row to the previous row, how about starting at index 1 and looking back by 1, then you won't go out of range. This edited version of your code does not crash:
for i in range(1, len(lat)):
    lat1 = lat[i - 1]
    lat2 = lat[i]
    for j in range(1, len(lon)):
        lon1 = lon[i - 1]
        lon2 = lon[i]
        dlon = lon2 - lon1
        dlat = lat2 - lat1
        # Haversine formula
        a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
        c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
        distance = R * c
        print(distance) # in m
What I ultimately want to do is round the expected value of a discrete random variable distribution to a valid number in the distribution. For example if I am drawing evenly from the numbers [1, 5, 6], the expected value is 4 but I want to return the closest number to that (ie, 5).
import numpy as np
from scipy.stats import rv_discrete
xk = (1, 5, 6)
pk = np.ones(len(xk))/len(xk)
custom = rv_discrete(name='custom', values=(xk, pk))
print(custom.expect())
# 4.0
def round_discrete(discrete_rv_dist, val):
    # do something here
    return answer
print(round_discrete(custom, custom.expect()))
# 5.0
I don't know a priori what distribution will be used (i.e. it might not be integers, it might be an unbounded distribution), so I'm really struggling to think of an algorithm that is sufficiently generic. Edit: I just learned that rv_discrete doesn't work on non-integer xk values.
As to why I want to do this, I'm putting together a monte-carlo simulation, and want a "nominal" value for each distribution. I think that the EV is the most physically appropriate rather than the mode or median. I might have values in the downstream simulation that have to be one of several discrete choices, so passing a value that is not within that set is not acceptable.
If there's already a nice way to do this in Python that would be great, otherwise I can interpret math into code.
Here is R code that I think will do what you want, using Poisson data to illustrate:
set.seed(322)
x = rpois(100, 7) # 100 obs from POIS(7)
a = mean(x); a
[1] 7.16 # so 7 is the value we want
d = min(abs(x-a)); d # min distance btw a and actual Pois val
[1] 0.16
u = unique(x); u # unique Pois values observed
[1] 7 5 4 10 2 9 8 6 11 3 13 14 12 15
v = u[abs(u-a)==d]; v # unique val closest to a
[1] 7
Hope you can translate it to Python.
Another run:
set.seed(323)
x = rpois(100, 20)
a = mean(x); a
[1] 20.32
d = min(abs(x-a)); d
[1] 0.32
u = unique(x)
v = u[abs(u-a)==d]; v
[1] 20
x
[1] 17 16 20 23 23 20 19 23 21 19 21 20 22 25 13 15 19 19 14 27 19 30 17 19 23
[26] 16 23 26 33 16 11 23 14 21 24 12 18 20 20 19 26 12 22 24 20 22 17 23 11 19
[51] 19 26 17 17 11 17 23 21 26 13 18 28 22 14 17 25 28 24 16 15 25 26 22 15 23
[76] 27 19 21 17 23 21 24 23 22 23 18 25 14 24 25 19 19 21 22 16 28 18 11 25 23
u
[1] 17 16 20 23 19 21 22 25 13 15 14 27 30 26 33 11 24 12 18 28
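A Python translation of the R steps above might look like the following. It is a sketch using only the standard library; the function name closest_observed_value is made up, and the short list stands in for the Poisson sample.

```python
from statistics import mean


def closest_observed_value(sample):
    """Return the observed value that lies nearest to the sample mean."""
    a = mean(sample)             # a = mean(x)
    unique = sorted(set(sample))  # u = unique(x); sorted so ties pick the smaller value
    return min(unique, key=lambda v: abs(v - a))


x = [7, 5, 4, 10, 2, 9, 8, 6]
print(mean(x), closest_observed_value(x))
```

Like the R version, this only considers values that actually occur in the sample, so a value of the distribution that was never drawn cannot be returned.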
Figured it out, and tested it working. If I plug my value X into the cdf, then I can plug that probability P = cdf(X) into the ppf. The values at ppf(P +- epsilon) will give me the closest values in the set to X.
Or more geometrically, for a discrete pmf, the point (X,P) will lie on a horizontal portion of the corresponding cdf. When you invert the cdf, (P,X) is now on a vertical section of the ppf. Taking P +- eps will give you the 2 nearest flat portions of the ppf connected to that vertical jump, which correspond to the valid values X1, X2. You can then do a simple difference to figure out which is closer to your target value.
import numpy as np
eps = np.finfo(float).eps
ev = custom.expect()
p = custom.cdf(ev)
ev_candidates = custom.ppf([p - eps, p, p + eps])
ev_candidates_distance = abs(ev_candidates - ev)
ev_closest = ev_candidates[np.argmin(ev_candidates_distance)]
print(ev_closest)
# 5.0
Terms:
pmf - probability mass function
cdf - cumulative distribution function (cumulative sum of the pmf)
ppf - percent point function (inverse of the cdf)
eps - epsilon (machine epsilon, the smallest increment above 1.0 for a float)
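When the support is finite and known explicitly, the same result can be reached without cdf/ppf by searching the support directly. This is a sketch under that assumption, not a replacement for the ppf approach on unbounded distributions; the function name nearest_support_to_mean is hypothetical, and because it never touches rv_discrete it also accepts non-integer xk values.

```python
def nearest_support_to_mean(xk, pk):
    """Return the support value closest to the expected value sum(x * p)."""
    ev = sum(x * p for x, p in zip(xk, pk))
    # Sorting first makes ties resolve to the smaller value deterministically.
    return min(sorted(xk), key=lambda x: abs(x - ev))


xk = (1, 5, 6)
pk = (1 / 3, 1 / 3, 1 / 3)
print(nearest_support_to_mean(xk, pk))
```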
Would the function ceil from the math library help? For example:
from math import ceil
print(float(ceil(3.333333333333333)))
I am converting an application that uses matplotlib's toolkit Basemap to using Cartopy in preparation for moving from Python 2 to Python 3.
I have found similar functions in Cartopy for Basemap's 'addcyclic()' and 'maskoceans()',
However I cannot find something similar in either numpy or Cartopy for Basemap's shiftgrid() function.
This is the code using Basemap:
'''
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import cartopy
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import numpy as np
from mpl_toolkits.basemap import shiftgrid
bmap = Basemap(projection='ortho', lat_0=0, lon_0=0)
lons = np.arange(30, 410, 30)
lons[1] = 70
lats = np.arange(0, 100, 10)
data = np.indices((lats.shape[0], lons.shape[0]))
data = data[0] + data[1]
data, lons = shiftgrid(180., data, lons, start=False)
llons, llats = np.meshgrid(lons, lats)
x, y = bmap(llons, llats)
bmap.contourf(x, y, data)
bmap.drawcoastlines()
'''
The initial data:
data
'''
[[ 0 1 2 3 4 5 6 7 8 9 10 11 12]
[ 1 2 3 4 5 6 7 8 9 10 11 12 13]
[ 2 3 4 5 6 7 8 9 10 11 12 13 14]
[ 3 4 5 6 7 8 9 10 11 12 13 14 15]
[ 4 5 6 7 8 9 10 11 12 13 14 15 16]
[ 5 6 7 8 9 10 11 12 13 14 15 16 17]
[ 6 7 8 9 10 11 12 13 14 15 16 17 18]
[ 7 8 9 10 11 12 13 14 15 16 17 18 19]
[ 8 9 10 11 12 13 14 15 16 17 18 19 20]
[ 9 10 11 12 13 14 15 16 17 18 19 20 21]]
lons
[ 30 70 90 120 150 180 210 240 270 300 330 360 390]
After the 'data, lons = shiftgrid(180., data, lons, start=False)':
data
[[ 5 6 7 8 9 10 11 12 1 2 3 4 5]
[ 6 7 8 9 10 11 12 13 2 3 4 5 6]
[ 7 8 9 10 11 12 13 14 3 4 5 6 7]
[ 8 9 10 11 12 13 14 15 4 5 6 7 8]
[ 9 10 11 12 13 14 15 16 5 6 7 8 9]
[10 11 12 13 14 15 16 17 6 7 8 9 10]
[11 12 13 14 15 16 17 18 7 8 9 10 11]
[12 13 14 15 16 17 18 19 8 9 10 11 12]
[13 14 15 16 17 18 19 20 9 10 11 12 13]
[14 15 16 17 18 19 20 21 10 11 12 13 14]]
lons
[-180 -150 -120 -90 -60 -30 0 30 70 90 120 150 180]
'''
I have tried the following cartopy code to recreate what the Basemap shiftgrid did.
This is the Cartopy code, some things are commented out as I tried them at one time:
'''
DATA_CRS = ccrs.PlateCarree()
lons = np.arange(30, 410, 30)
lons[1] = 70
lats = np.arange(0, 100, 10)
data = np.indices((lats.shape[0], lons.shape[0]))
data = data[0] + data[1]
# data2 = np.roll(data, -5)
# lons2 = np.mod(lons2 - 180.0, 360.0) - 180.0
cm_lon = 0
#llons, llats = np.meshgrid(lons2, lats)
llons, llats = np.meshgrid(lons, lats)
PROJECTION = ccrs.Orthographic(central_longitude=cm_lon)
fig1 = plt.figure(num=1, figsize=(11, 8.5), dpi=150)
ax = plt.axes(projection=PROJECTION)
ax.add_feature(cfeature.COASTLINE, linewidths=0.7)
ax.add_feature(cfeature.BORDERS, edgecolor='black', linewidths=0.7)
ax.contourf(llons, llats, data, transform=ccrs.PlateCarree())
'''
The data and the longitudes as original and I just used the 'central_longitude' in the projection.
The Basemap image shows the entire globe but the Cartopy image only shows from the equator up.
The color of the data seems similar except for the far right side, so I'm concerned the data didn't map the same in Cartopy as it did in Basemap.
So, the question is... Is there anything equivalent to Basemap's shiftgrid() or do I need to figure out something similar to Basemap's shiftgrid() or just use the 'central_longitude' in the projection?
I don't seem to be able to paste the .png files.
Any help is really appreciated.
I have searched the web looking for equivalent functions but haven't found one for the shiftgrid().
Thank you.
I'm not aware of any shiftgrid equivalent. It may be worth opening an issue over on the CartoPy issue tracker requesting such a feature. It would help in doing so to mention a solid use case to help drive the functionality.
This must be the most inelegant solution, but what I have been doing for several of Basemap's useful features that are not (yet?) in cartopy, is just to copy the function definitions from Basemap's source code. It works fine. For example, shiftgrid:
import numpy as np
import numpy.ma as ma
def shiftgrid(lon0,datain,lonsin,start=True,cyclic=360.0):
    """
    Shift global lat/lon grid east or west.
    .. tabularcolumns:: |l|L|
    ==============   ====================================================
    Arguments        Description
    ==============   ====================================================
    lon0             starting longitude for shifted grid
                     (ending longitude if start=False). lon0 must be on
                     input grid (within the range of lonsin).
    datain           original data with longitude the right-most
                     dimension.
    lonsin           original longitudes.
    ==============   ====================================================
    .. tabularcolumns:: |l|L|
    ==============   ====================================================
    Keywords         Description
    ==============   ====================================================
    start            if True, lon0 represents the starting longitude
                     of the new grid. if False, lon0 is the ending
                     longitude. Default True.
    cyclic           width of periodic domain (default 360)
    ==============   ====================================================
    returns ``dataout,lonsout`` (data and longitudes on shifted grid).
    """
    if np.fabs(lonsin[-1]-lonsin[0]-cyclic) > 1.e-4:
        # Use all data instead of raise ValueError, 'cyclic point not included'
        start_idx = 0
    else:
        # If cyclic, remove the duplicate point
        start_idx = 1
    if lon0 < lonsin[0] or lon0 > lonsin[-1]:
        raise ValueError('lon0 outside of range of lonsin')
    i0 = np.argmin(np.fabs(lonsin-lon0))
    i0_shift = len(lonsin)-i0
    if ma.isMA(datain):
        dataout = ma.zeros(datain.shape,datain.dtype)
    else:
        dataout = np.zeros(datain.shape,datain.dtype)
    if ma.isMA(lonsin):
        lonsout = ma.zeros(lonsin.shape,lonsin.dtype)
    else:
        lonsout = np.zeros(lonsin.shape,lonsin.dtype)
    if start:
        lonsout[0:i0_shift] = lonsin[i0:]
    else:
        lonsout[0:i0_shift] = lonsin[i0:]-cyclic
    dataout[...,0:i0_shift] = datain[...,i0:]
    if start:
        lonsout[i0_shift:] = lonsin[start_idx:i0+start_idx]+cyclic
    else:
        lonsout[i0_shift:] = lonsin[start_idx:i0+start_idx]
    dataout[...,i0_shift:] = datain[...,start_idx:i0+start_idx]
    return dataout,lonsout
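If all that is needed is a grid whose longitudes run from -180 to 180 in increasing order, a NumPy-only sketch is to wrap the longitudes and reorder the data columns to match. This is not a drop-in replacement for shiftgrid: the function name wrap_and_sort_longitudes is hypothetical, there is no lon0 argument, and a duplicated cyclic point (such as 30 and 390 in the question's lons) is not removed and would have to be dropped beforehand.

```python
import numpy as np


def wrap_and_sort_longitudes(data, lons):
    """Wrap longitudes into [-180, 180) and reorder the last axis of data to match."""
    lons = np.asarray(lons, dtype=float)
    wrapped = (lons + 180.0) % 360.0 - 180.0
    # A stable sort keeps the original relative order of equal longitudes.
    order = np.argsort(wrapped, kind="stable")
    return np.asarray(data)[..., order], wrapped[order]


lons = np.arange(0, 360, 90)            # [0, 90, 180, 270]
data = np.arange(8).reshape(2, 4)       # longitude is the right-most dimension
shifted_data, shifted_lons = wrap_and_sort_longitudes(data, lons)
print(shifted_lons)
print(shifted_data)
```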
I found the shiftgrid function in Basemap's source code.
You can call it as a standalone function together with Cartopy. It needs only these imports:
import numpy as np
import numpy.ma as ma
The function definition itself is identical to the shiftgrid listing in the previous answer.