Dividing City by Longitude and Latitude into Matrix - python

How do you divide a city such as San Francisco into equally sized blocks according to longitude and latitude coordinates? The aim of this is when I receive a coordinate of a location in the city I want to assign it automatically into one of these blocks.

Here's a simple use of the publicly available python library module Geohash (https://github.com/vinsci/geohash).
A python hashmap (dictionary) is used to map Geohash strings to lists of lat/lng points which have been added in the area ("block") represented by the Geohash. A query is performed by computing a position's Geohash and accessing the hashmap for that entry to retrieve all of the entries for that "block".
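from Geohash import encode  # module provided by the Geohash library linked above

### Add position to Geohash
### Each hash mapping contains a list of points in the "block"
###
def addToGeohash(m, latitude, longitude):
    p = (latitude, longitude)
    ph = encode(p[0], p[1], 5)  # precision 5 selects the "block" size
    if ph not in m:
        m[ph] = []
    m[ph].append(p)
    return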
### Get positions in same "block" or empty list if none found
###
def getFromGeohash(m, latitude, longitude):
    p = (latitude, longitude)
    ph = encode(p[0], p[1], 5)
    print("query hash: " + ph)
    if ph in m:
        return m[ph]
    return []
### Test
m = dict()
# Add 2 points in general area (1st near vista del mar)
addToGeohash(m, 37.779914,-122.509431)
addToGeohash(m, 37.780546,-122.366189)
# Query a point 769 meters from vista del mar point
n = getFromGeohash(m, 37.779642,-122.502993)
print ("Length of hashmap: "+str(len(m)))
print ("Contents of map : "+str(m))
print ("Query result : ")
print (n)
The default precision is 12 characters (5 was used in the example); the precision determines the "block" size and therefore how many points share each dictionary entry.
Note that a lat/lng based approach is non-linear, so over vast areas or closer to the poles the result would not be "equal sized blocks". However,
over the area of San Francisco and with sufficient precision, this non-linearity is reduced significantly.
Output
query hash: 9q8yu
Length of hashmap: 2
Contents of map : {'9q8yu': [(37.779914, -122.509431)], '9q8yz': [(37.780546, -122.366189)]}
Query result :
[(37.779914, -122.509431)]
To get a sense of the block size for various precisions use this link and enter for example 37.779914,-122.509431 and a precision of 5. Experiment with precision.
Here are the approximate box sizes for precisions 5-8:
5 ≤ 4.89km × 4.89km
6 ≤ 1.22km × 0.61km
7 ≤ 153m × 153m
8 ≤ 38.2m × 19.1m
An interesting feature of the Geohash which works mostly (and always with the SanFran area) is you can easily find the 8 adjacent neighbors by manipulating the last character. So, with minimal effort you could use a higher precision (e.g. 8) and reduce it to a size between 7 and 8.
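To make the effect of precision concrete, here is a minimal standalone sketch of the standard geohash encoding in pure Python. It is not the library's own code, and the function name geohash_encode is only illustrative. Because each additional character only subdivides the previous cell, a higher-precision hash truncated to 5 characters names the same block as the precision-5 hash.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lon, precision=12):
    """Encode a lat/lon pair by repeatedly halving the lon/lat intervals."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, value = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value, lon_lo = (value << 1) | 1, mid
            else:
                value, lon_hi = value << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value, lat_lo = (value << 1) | 1, mid
            else:
                value, lat_hi = value << 1, mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)

h5 = geohash_encode(37.779914, -122.509431, 5)
h8 = geohash_encode(37.779914, -122.509431, 8)
print(h5, h8, h8.startswith(h5))
```

Storing points under a longer hash and truncating at query time is one way to switch between block sizes without re-encoding.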

The naive approach is to find a big enough rectangle around the whole city (the area you want to cover). From the number of blocks desired you can deduce how many parts to divide each edge of the rectangle into; it should be fairly basic math.
Given a point, you can assign it to its block very quickly: just check its lat and long to see where it falls in the grid.
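A minimal sketch of this grid approach, assuming a rectangular bounding box and a fixed number of rows and columns. The bounding-box numbers for San Francisco are rough, illustrative values, and the names GridSpec and block_of are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple

class GridSpec(NamedTuple):
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    n_rows: int
    n_cols: int

def block_of(grid: GridSpec, lat: float, lon: float) -> Optional[Tuple[int, int]]:
    """Return the (row, col) of the block containing the point, or None if outside."""
    if not (grid.lat_min <= lat < grid.lat_max and grid.lon_min <= lon < grid.lon_max):
        return None
    row = int((lat - grid.lat_min) / (grid.lat_max - grid.lat_min) * grid.n_rows)
    col = int((lon - grid.lon_min) / (grid.lon_max - grid.lon_min) * grid.n_cols)
    # Guard against floating-point round-up at the upper edge
    return min(row, grid.n_rows - 1), min(col, grid.n_cols - 1)

# Rough, illustrative bounding box around San Francisco (assumed values)
sf = GridSpec(37.70, 37.84, -122.52, -122.35, n_rows=20, n_cols=20)
print(block_of(sf, 37.779914, -122.509431))
print(block_of(sf, 40.0, -122.5))  # outside the box, so None
```

The blocks are equal in degrees, not in meters; over a single city the difference is small.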

I've written different versions of this algorithm over the years. Spent a lot of time mulling over the problem. There are two issues involved:
Mapping 2-dimensional coordinates into a 1-dimensional value. (Fundamentally, in order to do this optimally, you must know the bounds of both dimensions)
Cutting the plane covering a sphere roughly into "squares". (The squares are different sizes as you get closer to the poles. Also, they're never actually squares, but the earth is so huge this usually doesn't matter)
This is some code that I did that suited my purpose. A few things to consider:
This creates blocks of a set size that you designate. The python Geohash library does this cool thing where the hash can be truncated to produce hashes of larger sizes. This algorithm does not do that, but on the flip side you can specify your desired block size (roughly).
This actually creates two-dimensional coordinates that I treat as a 1-dimensional string. I do this because it's good enough for me and I can easily manipulate the string data to get the adjacent blocks in a way that is clear and makes sense. For example, the hash "-23,407" is 23 blocks south and 407 blocks east of the center point, because the code below writes the latitude offset first. So if you want to move one block to the north, you just add 1 to -23: "-22,407".
The center point that I used here is the middle of Washington DC. You could use center point 0,0, or the middle of San Francisco or whatever. But do not use center points near the poles, such as -90,-180, because when the algorithm calculates the longitude offset in kilometers, it will calculate the distance between (-90, your-longitude) and (-90, -180). Both of these points are at the south pole, so the distance will be essentially zero, because at the poles all of these blocks are extremely small.
This is the main algorithm to hash the points:
# Define center point
CENTER_LAT = 38.893
CENTER_LNG = -77.084

def geohash(lat, lng, BLOCK_SIZE_KM=.05):
    # Get Latitude Offset
    lat_distance = haversine_km(
        (CENTER_LAT, CENTER_LNG),
        (lat, CENTER_LNG)
    )
    if lat < CENTER_LAT:
        lat_distance = lat_distance*-1
    # Floor instead of truncating toward zero, so the blocks on
    # either side of the center point have the same size
    lat_offset = int(lat_distance // BLOCK_SIZE_KM)
    # Get Longitude offset
    lng_distance = haversine_km(
        (CENTER_LAT, CENTER_LNG),
        (CENTER_LAT, lng)
    )
    if lng < CENTER_LNG:
        lng_distance = lng_distance*-1
    lng_offset = int(lng_distance // BLOCK_SIZE_KM)
    block_str = '%s,%s' % (lat_offset, lng_offset)
    return block_str
I included these helper functions for calculating the distance between two coordinates:
import math

def haversine_km(origin, destination):
    return haversine(origin, destination, 6371)  # mean Earth radius in km

def haversine(origin, destination, radius):
    lat1, lon1 = origin
    lat2, lon2 = destination
    lat1 = float(lat1)
    lon1 = float(lon1)
    lat2 = float(lat2)
    lon2 = float(lon2)
    dlat = math.radians(lat2-lat1)
    dlon = math.radians(lon2-lon1)
    a = math.sin(dlat/2) * math.sin(dlat/2) + math.cos(math.radians(lat1)) \
        * math.cos(math.radians(lat2)) * math.sin(dlon/2) * math.sin(dlon/2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
    d = radius * c
    return d
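Because the block hash is just two integer offsets joined by a comma, the adjacent blocks can be derived with simple arithmetic. A small sketch follows; the helper name adjacent_blocks is hypothetical and not part of the code above.

```python
def adjacent_blocks(block_str):
    """Return the 8 block strings surrounding an 'a,b' block hash."""
    a, b = (int(part) for part in block_str.split(","))
    return [
        "%s,%s" % (a + da, b + db)
        for da in (-1, 0, 1)
        for db in (-1, 0, 1)
        if (da, db) != (0, 0)  # skip the block itself
    ]

print(adjacent_blocks("-23,407"))
```

Looking up a point's own block plus these eight neighbors avoids missing nearby points that fall just across a block boundary.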

Related

Anonymizing geo location coordinates in python

I have a csv of names, transaction amount and an exact longitude and latitude of the location where the transaction was performed.
I want the final document to be anonymized - for that I need to change it into a CSV where the names are hashed (that should be easy enough), and the longitude and latitude are obscured within a radius of 2km.
I.e, changing the coordinates so they are within no more than 2 km from the original location, but in a randomized way, so that it is not revertible by a formula.
Does anyone know how to work with coordinates that way?
You could use locality sensitive hashing (LSH) to map similar co-ordinates (i.e. within a 2 KM radius), to the same value with a high probability. Hence, co-ordinates that map to the same bucket would be located closer together in Euclidean space.
Else, another technique would be to use any standard hash function y = H(x), and compute y modulo N, where N is the range of co-ordinates. Assume, your co-ordinates are P = (500,700), and you would like to return a randomized value in a range of [-x,x] KM from P.
import random

P = (500, 700)
Range = 1000  # 1000 meters for example
# Anonymize co-ordinates to within specified range
# (in CPython, hash() of a small int is the int itself, so this offset is deterministic)
ANON_X = hash(P[0]) % Range
ANON_Y = hash(P[1]) % Range
# Randomly add/subtract range
P = (P[0] + ANON_X * random.choice([-1, 1]), P[1] + ANON_Y * random.choice([-1, 1]))
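For the original goal (moving each point by a random amount of at most 2 km), one alternative to hashing is to sample a random offset inside a disc around the point. This is only a sketch under a small-distance, flat-Earth approximation, which should be reasonable for a 2 km radius away from the poles. The function name jitter_within_km and the 6371 km Earth radius are assumptions made for illustration.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)

def jitter_within_km(lat, lon, max_km=2.0, rng=random):
    """Return a point displaced by a random offset of at most max_km."""
    # sqrt makes samples uniform over the disc's area instead of clustered at the centre
    distance = max_km * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    d_north = distance * math.cos(bearing)
    d_east = distance * math.sin(bearing)
    new_lat = lat + math.degrees(d_north / EARTH_RADIUS_KM)
    # a degree of longitude shrinks with cos(latitude)
    new_lon = lon + math.degrees(d_east / (EARTH_RADIUS_KM * math.cos(math.radians(lat))))
    return new_lat, new_lon

print(jitter_within_km(37.779914, -122.509431))
```

Since the offset is not derived from the input, it cannot be reversed by a formula. Passing rng=random.SystemRandom() draws the offsets from the operating system's entropy source instead of the default generator.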

Python: Calculation of mean latitude including dateline crosses

I have an array (lons) of longitude values in the range [-180, 180]. I need to find the mean of the time series. This is easily done with
np.mean(lons)
This straightforward mean, of course, doesn't work if the series contains values on either side of the dateline. What is the correct way of calculating the mean for all possible cases? Note, I would rather not have a condition that treats dateline crossing cases differently.
I've played around with np.unwrap after converting from degrees to rad, but I know my calculations are wrong because a small percentage of cases are giving me mean longitudes somewhere near 0 degrees (the meridian) over Africa. These aren't possible as this is an ocean data set.
Thanks.
EDIT: I now realise a more precise way of calculating the mean [lat, lon] position of a time series might be to convert to a cartesian grid. I may go down this route.
This is an application for directional statistics, where the angular mean is computed in the complex plane (see this section). The result is a complex number whose argument (phase angle) represents the mean angle:
import numpy as np

def angular_mean(angles_deg):
    angles_deg = np.asarray(angles_deg, dtype=float)
    N = len(angles_deg)
    # Average the unit vectors exp(i*angle); np.angle returns the argument of the mean
    mean_c = 1.0 / N * np.sum(np.exp(1j * angles_deg * np.pi/180.0))
    return np.angle(mean_c, deg=True)

lons = [
    np.array([-175, -170, 170, 175]),  # broad distribution
    np.random.rand(1000)  # narrow distribution
]
for lon in lons:
    print(angular_mean(lon), np.mean(lon))
As you can see, arithmetic mean and angular mean are quite similar for a narrow distribution, whereas they differ significantly for a broad distribution.
Using cartesian coordinates is not appropriate, as the center of mass will be located within the earth, but since you are using surface data I assume you want it to be located on the surface.
Here is my solution. Note that I calculate the mean latitude and longitude, but also the mean distance (mean_dist) of the [lat, lon] coordinates from the calculated mean latitude (lat_mean) and mean longitude (lon_mean). The reason is that I'm also interested in how much variation there is from the central [lat, lon]. I believe this is correct but I'm open to discussion!
import numpy as np
import geopy.distance

def mean_position(lats, lons):  # function name is illustrative
    lon_rad = np.deg2rad(lons)  # lons in degrees [-180, 180]
    lat_rad = np.deg2rad(lats)  # lats in degrees [-90, 90]
    R = 6371  # Approx radius of Earth (km)
    x = R * np.cos(lat_rad) * np.cos(lon_rad)
    y = R * np.cos(lat_rad) * np.sin(lon_rad)
    z = R * np.sin(lat_rad)
    x_mean = np.mean(x)
    y_mean = np.mean(y)
    z_mean = np.mean(z)
    # The mean vector lies inside the sphere (length below R), so take its
    # direction with arctan2 instead of arcsin(z_mean / R)
    lat_mean = np.rad2deg(np.arctan2(z_mean, np.hypot(x_mean, y_mean)))
    lon_mean = np.rad2deg(np.arctan2(y_mean, x_mean))
    # Calculate distance from centre point for each [lat, lon] pair
    # (geodesic is used because vincenty was removed in geopy 2.0)
    dist_list = [
        geopy.distance.geodesic((lat, lon), (lat_mean, lon_mean)).km
        for lat, lon in zip(lats, lons)
    ]
    mean_dist = np.mean(dist_list)
    return lat_mean, lon_mean, mean_dist

Map point to closest point on fibonacci lattice

I use the following code to generate the fibonacci lattice, see page 4 for the unit sphere. I think the code is working correctly. Next, I have a list of points (specified by latitude and longitude in radians, just as the generated fibonacci lattice points). For each of the points I want to find the index of the closest point on the fibonacci lattice. I.e. I have latitude and longitude and want to get i. How would I do this?
I specifically don't want to iterate over all the points from the lattice and find the one with minimal distance, as in practice I generate much more than just 50 points and I don't want the runtime to be O(n*m) if O(m) is possible.
FWIW, when talking about distance, I mean haversine distance.
#!/usr/bin/env python3
import math

n = 50
phi = (math.sqrt(5.0) + 1.0) / 2.0
phi_inv = phi - 1.0
ga = 2.0 * phi_inv * math.pi
for i in range(-n, n + 1):
    # Wrap the longitude (radians) into [-pi, pi)
    longitude = (ga * i + math.pi) % (2.0 * math.pi) - math.pi
    latitude = math.asin(2.0 * float(i) / (2.0 * n + 1.0))
    print("{}-th point: ".format(i + n + 1))
    print("\tLongitude is {}".format(longitude))
    print("\tLatitude is {}".format(latitude))
# Given latitude and longitude of point A, determine index i of point which is closest to A
# ???
What you are probably looking for is a spatial index: https://en.wikipedia.org/wiki/Spatial_database#Spatial_index. Since you only care about nearest neighbor search, you might want to use something relatively simple like http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.KDTree.html.
Note that spatial indexes usually consider points on a plane rather than a sphere. To adapt it to your situation, you'll probably want to split up the sphere into several regions that can be approximated by rectangles. You can then find several of the nearest neighbors according to the rectangular approximation and compute their actual haversine distances to identify the true nearest neighbor.
It's somewhat easier to use spherical coordinates here.
Your spherical coordinates are given by lat = arcsin(2 * i / (2 * N + 1)), and lon = 2 * PI * i / the golden ratio.
Reversing this is not a dead end - it's a great way to determine latitude. The issue with the reverse approach is only that it fails to represent longitude.
sin(lat) = 2 * i / (2 * N + 1)
i = (2 * N + 1) * sin(lat) / 2
This i is an exact representation of the index of a point matching the latitude of your input point. The next step is your choice - brute force, or choosing a different spiral.
The Fibonacci spiral is great at covering a sphere, but one of its properties is that it does not preserve locality between consecutive indices. Thus, if you want to find the closest points, you have to search a wide range, and it is difficult to even estimate bounds for this search. Brute force is expensive. However, this is already a significant improvement over the original problem of checking every point: you can threshold your results, bound your search in any way you like, and get approximately accurate results. If you want to accomplish this in a more deterministic way, though, you'll have to dig deeper.
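One way to bound that search exactly, sketched here as an illustration rather than as the method of this answer: start at the index whose latitude matches the query, then walk outward in both directions. The latitude gap is a lower bound on the great-circle distance and grows monotonically with the index, so each walk can stop once the gap alone reaches the best distance found so far. The function names are hypothetical, and the longitude is assumed to be 2 * PI * i / phi wrapped into [-pi, pi).

```python
import math

PHI = (math.sqrt(5.0) + 1.0) / 2.0
GOLDEN_ANGLE = 2.0 * math.pi / PHI  # longitude step per index

def lattice_point(i, n):
    """Latitude and longitude (radians) of lattice point i, with -n <= i <= n."""
    lat = math.asin(2.0 * i / (2.0 * n + 1.0))
    lon = (GOLDEN_ANGLE * i + math.pi) % (2.0 * math.pi) - math.pi
    return lat, lon

def angular_distance(lat1, lon1, lat2, lon2):
    """Haversine distance on the unit sphere, in radians."""
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * math.asin(min(1.0, math.sqrt(a)))

def nearest_index(lat, lon, n):
    """Index i of the lattice point closest to (lat, lon), both in radians."""
    # Start at the index whose latitude best matches the query
    start = int(round((2.0 * n + 1.0) * math.sin(lat) / 2.0))
    start = max(-n, min(n, start))
    best_i = start
    best_d = angular_distance(lat, lon, *lattice_point(start, n))
    for step in (1, -1):
        i = start + step
        while -n <= i <= n:
            p_lat, p_lon = lattice_point(i, n)
            # The latitude gap only grows in this direction, so once it
            # reaches best_d nothing further out can be closer
            if abs(p_lat - lat) >= best_d:
                break
            d = angular_distance(lat, lon, p_lat, p_lon)
            if d < best_d:
                best_i, best_d = i, d
            i += step
    return best_i

print(nearest_index(0.3, -1.2, 50))
```

The result matches a full scan over all points, but the walk usually stops after inspecting only the points in a narrow latitude band around the query.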
My solution to this problem looks a bit like this (and apologies, this is written in C# not Python)
// Take a stored index on a spiral on a sphere and convert it to a normal vector
public Vector3 UI2N(uint i)
{
    double h = -1 + 2 * ((double)i / n); // cast so the division is not done in integer arithmetic
    double phi = math.acos(h);
    double theta = sqnpi*phi;
    return new Vector3((float)(math.cos(theta) * math.sin(phi)), (float)math.cos(phi), (float)(math.sin(theta) * math.sin(phi)));
}

// Take a normalized vector and return the closest matching index on a spiral on a sphere
public uint N2UI(Vector3 v)
{
    double iT = sqnpi * math.acos(v.y); // theta calculated to match latitude
    double vT = math.atan2(v.z, v.x); // theta calculated to match longitude
    double iTR = (iT - vT + math.PI_DBL)%(twoPi); // Remainder from iTR, preserving the coarse number of turns
    double nT = iT - iTR + math.PI_DBL; // new theta, containing info from both
    return (uint)math.round(n * (math.cos(nT / sqnpi) + 1) / 2);
}
Where n is the spiral's resolution, and sqnpi is sqrt(n * PI).
This is not the most efficient possible implementation, nor is it particularly clear. However, it is a middle ground where I can attempt to explain it.
The spiral I am using is one I found here:
https://web.archive.org/web/20121103201321/http://groups.google.com/group/sci.math/browse_thread/thread/983105fb1ced42c/e803d9e3e9ba3d23#e803d9e3e9ba3d23%22%22
(Orion's spiral is basically the one I'm using here)
From this I can reverse the function to get both a coarse and a fine measure of Theta (distance along the spiral), and combine them to find the best-fitting index. The way this works is that iT is cumulative, but vT is periodic. vT is a more correct measure of the longitude, but iT is a more correct measure of latitude.
I strongly encourage that anyone reading this try things other than what I'm doing with my code, as I know that it can be improved from here - that's what I did, and I would do well to do more. Using doubles is absolutely necessary here with the current implementation - otherwise too much information would be lost, particularly with the trig functions and the conversion to uint.

How to convert Longitude,Latitude, Elevation to Cartesian coordinates?

I downloaded weather data that has longitude (in decimal degrees), latitude (in decimal degrees), and elevation (in m) values. There is no information about the coordinate system used. How can I convert it to Cartesian coordinates? My attempts are below, but my problem is finding the right formulas.
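def cartesian(self, longitude, latitude, elevation):
    # Spherical-Earth approximation; longitude and latitude in decimal degrees
    lon = math.radians(longitude)
    lat = math.radians(latitude)
    R = 6378137.0 + elevation  # relative to centre of the earth
    # Latitude is measured from the equator, so cos(lat) scales X and Y
    X = R * math.cos(lat) * math.cos(lon)
    Y = R * math.cos(lat) * math.sin(lon)
    Z = R * math.sin(lat)
    return X, Y, Z

def cartesian3(self, longitude, latitude, elevation):
    # Flat projection: one arc minute of latitude is about 1852 m
    X = longitude * 60 * 1852 * math.cos(math.radians(latitude))
    Y = latitude * 60 * 1852
    Z = elevation
    return X, Y, Z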
An answer here by Daphna Shezaf uses different formulas. However, it does not use elevations.
I would appreciate it if someone could clear up my confusion: should elevation be used when converting from long/lat or not? What are the right formulas? I have tried to compare the results of my code with this website using specific long, lat, and elev values. Both of my methods above give results that are far from the website's result.
UPDATE
I would like to share the solution to my problem. I have implemented the lla2ecef function from Matlab, as here, in Python. It converts longitude and latitude in radians, plus elevation (height in m), to Cartesian coordinates. I only need to convert latitude and longitude to radians if they are in decimal degrees:
latitude = (lat * math.pi) / 180 #latitude in radian, and lat in decimal
To verify my calculations, I compared the conversion result to the website above (website) and this one as well. Both give me almost the same result.
Note: If, for simplicity, you treat the Earth as a sphere, you can use def cartesian (I updated it; thanks to Sasha for the correction). If you treat the Earth as an ellipsoid (WGS 84 geodetic system), you can implement the conversion as in lla2ecef. def cartesian3 is a cartographic projection (thanks to rodrigo).
Elevation is measured from sea level. The radius connects the center of the Earth to your geographic location, which means that R = 6371 km + elevation. This reference value can vary, and the exact value should be specified by the data provider. Your first function seems to be correct; just replace the R calculation.
To be blunt: without the radius (elevation), it is not possible to convert from spherical to Cartesian coordinates. The least you could do is use the sea-level radius, but this will only give you coordinates on a planet that is a perfect sphere, which Earth isn't.
For example, on the website you provided, you can select the ellipsoid. For the WGS 84 standard, I found the following on Wikipedia:
The WGS 84 datum surface is an oblate spheroid (ellipsoid) with major (equatorial) radius a = 6378137 m at the equator and flattening f = 1/298.257223563. The polar semi-minor axis b then equals a times (1−f), or 6356752.3142 m
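With those two constants, the ellipsoidal conversion can be sketched as follows. This uses the standard geodetic-to-ECEF formulas and is not a copy of Matlab's lla2ecef. The function name geodetic_to_ecef is illustrative, and the sketch assumes the elevation is a height above the WGS 84 ellipsoid, which may differ from a height above sea level by tens of meters.

```python
import math

WGS84_A = 6378137.0                    # semi-major (equatorial) axis in m
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert latitude/longitude in degrees and height in m to ECEF X, Y, Z in m."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z

print(geodetic_to_ecef(0.0, 0.0, 0.0))   # on the equator at the prime meridian
print(geodetic_to_ecef(90.0, 0.0, 0.0))  # at the north pole
```

At the equator the X coordinate equals a, and at the pole the Z coordinate equals b = a(1 − f), which gives a quick sanity check against the values quoted above.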

Calculate the azimuth of a line between two points in a negative coordinate system

I have a problem that I cannot seem to work out. I also cannot find a solution already given on any prior posts.
I am working in a metric coordinate system where all of the variables are negative values (example: origin = -2,-2; north = -2,-1; east = -1,-2; south = -2, -3, west = -3,-2). It's a southern hemisphere coordinate system. I need to calculate the azimuth orientation and slope of a line that passes through two points, given that the first point is the origin point.
I have been able to write a script using Python that calculates the orientations (0-360 degrees) for each pair of points, but a number of the values are 180 degrees opposite, according to a reference data set that I am comparing my results against, which already has these values calculated.
If I use ATAN2 and then convert radians to degrees, does it matter which quadrant on a 2D graph the line passes through? Do I need to add or subtract 0, 90, 180, 270, or 360 depending on the quadrant? I think this is my problem, but I am not sure.
Lastly, the above assumes that I am making the calculations for orientation and slope in 2D spaces, respectively. Is there a more parsimonious way to calculate these variables within 3D space?
I've attached my current block of code that includes the calculation of the azimuth angles per quadrant. I would really appreciate any help you all can provide.
dn = north_2 - north_1
de = east_2 - east_1
x = x + 1
if dn<0 and de<=0:
    q = "q3"
    theta = math.degrees(math.atan2(dn,de))
    orientation = 90- theta
if dn>=0 and de <0:
    q = "q4"
    theta = math.degrees(math.atan2(dn,de))
    orientation = 270-theta
if dn>0 and de>=0:
    q = "q1"
    theta = math.degrees(math.atan2(dn,de))
    orientation = 270-theta
if dn<=0 and de>0:
    q = "q2"
    theta = math.degrees(math.atan2(dn,de))
    orientation = 90-theta
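No answer is recorded here. As a hedged sketch of one common approach: atan2 already resolves the quadrant from the signs of its two arguments, so passing the easting difference first and the northing difference second gives the angle clockwise from north, and a single modulo maps it into 0-360. Only the coordinate differences matter, so negative coordinates need no special handling. The function names and the dz height-difference argument are illustrative assumptions.

```python
import math

def azimuth_deg(east_1, north_1, east_2, north_2):
    """Bearing from point 1 to point 2, clockwise from north, in [0, 360)."""
    de = east_2 - east_1
    dn = north_2 - north_1
    # atan2(de, dn) measures the angle from the north axis toward east
    return math.degrees(math.atan2(de, dn)) % 360.0

def slope_deg(de, dn, dz):
    """Inclination of the line relative to the horizontal plane, in degrees."""
    return math.degrees(math.atan2(dz, math.hypot(de, dn)))

origin = (-2, -2)  # (east, north), as in the example coordinates
targets = {"north": (-2, -1), "east": (-1, -2), "south": (-2, -3), "west": (-3, -2)}
for name, (e, n) in targets.items():
    print(name, azimuth_deg(origin[0], origin[1], e, n))
```

In 3D the azimuth still comes from the horizontal components alone, while the slope uses the height difference against the horizontal distance.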
