20 Feb 2022, 18:28

Creating Unique Coordinate Keys

In a previous post, I examined creating a data structure for quickly getting a value based on a coordinate. For context, here a coordinate is meaning a longitude, latitude pair. This pair can often be found as an array ([0.1, 0.2] etc), and sometimes as a property on an object ({ lng: 0.1, lat: 0.2 }).

In this post, I want to dig a bit deeper and examine a new key creation approach that saves on memory and speed whilst also opening up new avenues for more simply implementing data structures such as Sets for coordinates.

Maps, Sets and Strict Equality

First a bit of background. In JavaScript two arrays are only said to be equal if they reference the same array, even if the values are the same. This means can’t use two identical coordinate arrays to exclusively associate to some piece of data. For example:

const map = new Map();
map.set([-0.118092, 51.509865], "a");
map.set([-0.118092, 51.509865], "b");

// For those familiar with JS Map and Set behaviour,
// the following may not be a suprise:

map.get([-0.118092, 51.509865]); // Returns undefined
map.size; // Return 2, as each array creates it's own entry

Now you might be able to see that in some use cases we might think of a coordinate as a unique key and not be concerned about object equality - we just care that the values match. In some languages, tuples of a set length with the same values can be considered equal, and this is the same logic we might want to apply here.

What about toString?

The second problem it gets around is perhaps how one might intuatively try to solve this which is to use toString:

const map = new Map();
const key = [-0.118092, 51.509865].toString();
map.set(key, "a");
map.set(key, "b");

map.get([-0.118092, 51.509865]); // Returns "b"
map.size; // Return 1, as each array creates it's own entry

This gets us closer to what we want and could have been one way to achieve what we wanted in the original blog post. However, unfortunately toString is a somewhat slow operation in JavaScript.

The approach in the original post gets around both of these issues by using nested Maps, so setting in the CoordinateMap creates a Map for the longitude value and then another nested one for the latitude value. This works, but in the worst case means you end up with two Maps for each coordinate. Understandably this could be viewed as excessive.

Is there a fast way to create a unique key?

I’ve been experimenting with a fast way to create a key - the logic works like this: in Mathematics, there is a concept of a pairing function that allows you to uniquely map two natural numbers to a single larger natural number. In our case, we can think of natural numbers as positive integers. This got me thinking, if we can encode our coordinates as positive integers we can use one of the pairing functions to create a single key. If we work on the assumption that we can pin the coordinates to a maximum precision, say a very generous 9 decimals, then thankfully we can fit the numbers into a single digit (JavaScripts maximum integer size is a whopping 9007199254740991 as specified by IEEE 754).

Using Szudzik’s Elegant pairing function we can achieve this doing the following:

export function pairCoordinate(
  longitude: number,
  latitude: number,
  factor = 10000000 // 9 digit precision
) {
  // Make longitude and latitude postive
  // Ensure they are integers
  const x = ((longitude + 180) * factor) | 0;
  const y = ((latitude + 90) * factor) | 0;

  // Szudzik pairing
  return x >= y ? x * x + x + y : y * y + x;
}

This approach for generating unique keys for a coordinate is about 45x faster than using toString. In a future post I will examine how we can reimplement at CoordinateMap and also look at implementing a CoordinateSet to quickly check for presence of a given coordinate.

13 Feb 2022, 18:28

Coordinate Maps and Sets

One problem I was pondering recently is in the geospatial arena - somewhere I spend a lot of my time working. I’ve noticed that it’s relatively easy to do key-value lookups on a single piece of data, i.e. a string or a number. For example, if you want to associate a letter in the alphabet to a score you could use a Map to do something like this:

const map = new Map(["a", 1], ["b", 2], ["c", 3], ["d", 4], ["e", 5], ["f", 6]); //etc

The problem I have been mulling over is if you have a piece of data which is unique but multidemnsional, like a coordinate, how do you handle that? For example this approach now falls apart:

coordinateMap = new Map();
coordinateMap.set([32.95, 83.31], 1);
coordinateMap.set([32.95, 83.31], 2);
coordinateMap.size; // This will return 2, as each array creates it's own entry

Assuming you have a coordinate in the format [number, number], you could simply convert the coordinate to a string using the toString array method. This gives you a unique key for that piece of data:

const coordinateMap = new Map();
const coordinate = [32.95, 83.31];
coordinateMap.set(coordinate.toString(), "some value");

It turns out, the Array toString method is quite slow. However using string concatenation/template strings is quite fast, as fast as using some sort of number pairing function, and as such, we can use something like this:

`${coordinates[0]},${coordinates[1]}`;

Coordinate Map

This got me thinking, can we come up with a data structure that handles this problem elegantly, with a nice Map-like API? This is where I got to:

import { CoordinateDataStructure } from "./coordinate-data-structure";

export type NotUndefined<T> = T extends undefined ? never : T;

export class CoordinateMap<
  T extends U extends any ? NotUndefined<T> : unknown,
  U = any
> extends CoordinateDataStructure {
  private _size = 0;
  private _map: Map<string, T> = new Map();

  // O(1)
  get size() {
    return this._size;
  }

  // Ignore setting the size property publically
  set size(value: number) {}

  // O(1)
  set(coordinates: [number, number], value: T): void {
    this.validateCoordinate(coordinates);

    if (value === undefined) {
      throw new Error("Values can not be set to undefined - use 'remove'");
    }

    const key = this.getKey(coordinates);

    if (!this._map.get(key)) {
      this._size++;
    }

    this._map.set(key, value);
  }

  // O(1)
  get(coordinates: [number, number]): T | undefined {
    this.validateCoordinate(coordinates);
    const key = this.getKey(coordinates);
    return this._map.get(key);
  }

  // O(1)
  delete(coordinates: [number, number]): void {
    this.validateCoordinate(coordinates);
    const key = this.getKey(coordinates);

    if (this._map.get(key) === undefined) {
      throw new Error(
        `Coordinate [${key}] cannot be removed as is not in CoordinateMap`
      );
    }

    this._size--;
    this._map.delete(key);
  }

  // O(1)
  has(coordinates: [number, number]): boolean {
    this.validateCoordinate(coordinates);
    return this._map.get(this.getKey(coordinates)) !== undefined;
  }

  clear() {
    this._map.clear();
    this._size = 0;
  }
}

Here validateCoordinate and getKey comes from a base abstract class which we share with the Set data structure which is outlined below. For completeness, it looks like so:

export abstract class CoordinateDataStructure {
  protected getKey(coordinates: [number, number]) {
    return `${coordinates[0]},${coordinates[1]}`;
  }

  protected validateCoordinate([lng, lat]: [number, number]) {
    if (
      !(typeof lng === "number" && !isNaN(lng) && lng >= -180 && lng <= 180) ||
      !(typeof lat === "number" && !isNaN(lat) && lat >= -90 && lat <= 90)
    ) {
      throw new Error("Longitude and latitude must be valid");
    }
  }
}

Designing the Map API was a fun little challenge - in the end, I tried to match the classic JavaScript Map API as closely as possible.

To demonstrate, let’s say we’re in a testing environment like Jest/Mocha etc, here I can demonstrate the API of the class:

const coordinateMap = new CoordinateMap<number>();

expect(coordinateMap.get([0.0, 1.0])).toBe(undefined);
expect(coordinateMap.has([0.0, 1.0])).toBe(false);

coordinateMap.set([0.0, 1.0], 1);

expect(coordinateMap.get([0.0, 1.0])).toBe(1);
expect(coordinateMap.has([0.0, 1.0])).toBe(true);

coordinateMap.delete([0.0, 1.0]);

expect(coordinateMap.get([0.0, 1.0])).toBe(undefined);
expect(coordinateMap.has([0.0, 1.0])).toBe(false);

Coordinate Set

I also went ahead and implemented an equivalent data structure for the Set data structure:

import { CoordinateDataStructure } from "./coordinate-data-structure";

export class CoordinateSet extends CoordinateDataStructure {
  private _size: number;
  private map: Map<string, [number, number]>;

  constructor(array?: [number, number][]) {
    super();

    if (Array.isArray(array)) {
      this.map = new Map();
      arrOrCoordSet.forEach((c) => {
        this.validateCoordinate(c);
        this.map.set(this.getKey(c), c);
      });
      this._size = arrOrCoordSet.length;
    } else {
      this.map = new Map();
      this._size = 0;
    }
  }

  // O(1)
  get size() {
    return this._size;
  }

  // Ignore setting the size property publically
  set size(value: number) {}

  has(coordinate: [number, number]) {
    return Boolean(this.map.get(this.getKey(coordinate)));
  }

  add(coordinate: [number, number]) {
    const key = this.getKey(coordinate);
    if (this.map.get(key)) {
      return;
    }
    this._size++;
    this.map.set(key, coordinate);
    return this;
  }

  delete(coordinate: [number, number]): boolean {
    const key = this.getKey(coordinate);
    if (!this.map.get(key)) {
      return false;
    }
    this._size--;
    return this.map.delete(key);
  }

  clear() {
    this.map = new Map();
    this._size = 0;
  }
}

This data structure allows you to do the core JavaScript Set operations, like so:

const set = new CoordinateSet();

set.add([180, -90]);
set.add([0.1, 0.2]);
set.delete([180, -90]);
expect(set.size).toBe(1);
set.delete([0.1, 0.2]);
expect(set.size).toBe(0);

A few things could be done to probably improve these data structures, for example adding all the iterable methods such as entries and values. I may extend this at some point and publish it on a library on npm.

31 Jan 2021, 21:28

Measuring the World with JavaScript

If I asked you to measure two points on a table, you might get out a ruler or a tape measure to tell me the distance between the two. However, what if I asked you to measure the points between two capital cities? Suddenly it becomes more complicated, not only just logistically but also in terms of assumptions we have to make about that measurement.

One big change, apart from just scale, is that we now have to account for the roundness of the world. Thankfully mathematicians who were far smarter than me have come up with formulas that can account for this in various ways. One such formula is the haversine formula. This is credited as far back as 1801 to Spanish astronomer and mathematician José de Mendoza y Ríos.

The Haversine formula is useful in a lot of spherical trigonometry problems, as it allows a person to determine what is called a great-circle distance between two given points on a sphere. In particular relevance to what we are looking at today, it is useful for determining the approximate distance between two latitude and longitudes on the globe.

When it comes to translating this to code Chris Veness, a UK based programmer has written a healthy amount of code in the space - a lot of it is featured in the popular geospatial library Turf.js. Here’s a stand-alone adaptation of some of his code for the Haversine formula, written in TypeScript:

export function haversineDistance(
  pointOne: { lng: number; lat: number },
  pointTwo: { lng: number; lat: number }
) {
  const toRadians = (latOrLng: number) => (latOrLng * Math.PI) / 180;

  const phiOne = toRadians(pointOne.lat);
  const lambdaOne = toRadians(pointOne.lng);
  const phiTwo = toRadians(pointTwo.lat);
  const lambdaTwo = toRadians(pointTwo.lng);
  const deltaPhi = phiTwo - phiOne;
  const deltalambda = lambdaTwo - lambdaOne;

  const a =
    Math.sin(deltaPhi / 2) * Math.sin(deltaPhi / 2) +
    Math.cos(phiOne) *
      Math.cos(phiTwo) *
      Math.sin(deltalambda / 2) *
      Math.sin(deltalambda / 2);
  const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));

  const radius = 6371e3;
  const distance = radius * c;

  return distance;
}

The Haversine formula suffers from one core problem, which is that it assumes that the world is round. Unfortunately for us, the world is not actually round, it’s closer to what we’d determine as an oblate spheroid; a sphere that is flattened at the poles. As the Haversine makes this assumption, it cannot be guaranteed correct to provide results with better than 0.5% accuracy.

This is where another formula comes in, Vincenty’s inverse formula. The formula comes from Thaddeus Vincenty, a Polish American geodesist who was with the U.S. Air Force. Vincenty’s inverse formula works with the idea that the Earth is an oblate-spheroid. This approach is more complex and is also iteration based, unlike the haversine formula.

Again, I’ve taken some of Chris’s code and made it standalone (you can just copy and paste this function). It works on the WGS84 ellipsoid, as this is probably the most common use case. Lets take a look at the code, again in TypeScript:

export function inverseVincentyDistance(
  pointOne: { lng: number; lat: number },
  pointTwo: { lng: number; lat: number }
): number {
  const toRadians = (latOrLng: number) => (latOrLng * Math.PI) / 180;

  const phiOne = toRadians(pointOne.lat);
  const lambda1 = toRadians(pointOne.lng);
  const phiTwo = toRadians(pointTwo.lat);
  const lambda2 = toRadians(pointTwo.lng);

  const wgs84ellipsoid = {
    a: 6378137,
    b: 6356752.314245,
    f: 1 / 298.257223563,
  };
  const { a, b, f } = wgs84ellipsoid;

  const L = lambda2 - lambda1; // L = difference in longitude, U = reduced latitude, defined by tan U = (1-f)·tanphi.
  const tanU1 = (1 - f) * Math.tan(phiOne),
    cosU1 = 1 / Math.sqrt(1 + tanU1 * tanU1),
    sinU1 = tanU1 * cosU1;
  const tanU2 = (1 - f) * Math.tan(phiTwo),
    cosU2 = 1 / Math.sqrt(1 + tanU2 * tanU2),
    sinU2 = tanU2 * cosU2;

  const antipodal =
    Math.abs(L) > Math.PI / 2 || Math.abs(phiTwo - phiOne) > Math.PI / 2;

  let lambda = L,
    sinLambda = null,
    cosLambda = null; // lambda = difference in longitude on an auxiliary sphere
  let sigma = antipodal ? Math.PI : 0,
    sinSigma = 0,
    cosSigma = antipodal ? -1 : 1,
    sinSqsigma = null; // sigma = angular distance P₁ P₂ on the sphere
  let cos2sigmaM = 1; // sigmaM = angular distance on the sphere from the equator to the midpoint of the line
  let sinalpha = null,
    cosSqAlpha = 1; // alpha = azimuth of the geodesic at the equator
  let C = null;

  let lambdaʹ = null,
    iterations = 0;
  do {
    sinLambda = Math.sin(lambda);
    cosLambda = Math.cos(lambda);
    sinSqsigma =
      cosU2 * sinLambda * (cosU2 * sinLambda) +
      (cosU1 * sinU2 - sinU1 * cosU2 * cosLambda) *
        (cosU1 * sinU2 - sinU1 * cosU2 * cosLambda);

    if (Math.abs(sinSqsigma) < Number.EPSILON) {
      break; // co-incident/antipodal points (falls back on lambda/sigma = L)
    }

    sinSigma = Math.sqrt(sinSqsigma);
    cosSigma = sinU1 * sinU2 + cosU1 * cosU2 * cosLambda;
    sigma = Math.atan2(sinSigma, cosSigma);
    sinalpha = (cosU1 * cosU2 * sinLambda) / sinSigma;
    cosSqAlpha = 1 - sinalpha * sinalpha;
    cos2sigmaM =
      cosSqAlpha != 0 ? cosSigma - (2 * sinU1 * sinU2) / cosSqAlpha : 0; // on equatorial line cos²alpha = 0 (§6)
    C = (f / 16) * cosSqAlpha * (4 + f * (4 - 3 * cosSqAlpha));
    lambdaʹ = lambda;
    lambda =
      L +
      (1 - C) *
        f *
        sinalpha *
        (sigma +
          C *
            sinSigma *
            (cos2sigmaM + C * cosSigma * (-1 + 2 * cos2sigmaM * cos2sigmaM)));
    const iterationCheck = antipodal
      ? Math.abs(lambda) - Math.PI
      : Math.abs(lambda);
    if (iterationCheck > Math.PI) {
      throw new Error("lambda > Math.PI");
    }
  } while (Math.abs(lambda - lambdaʹ) > 1e-12 && ++iterations < 1000);
  if (iterations >= 1000) {
    throw new Error("Vincenty formula failed to converge");
  }

  const uSq = (cosSqAlpha * (a * a - b * b)) / (b * b);
  const A = 1 + (uSq / 16384) * (4096 + uSq * (-768 + uSq * (320 - 175 * uSq)));
  const B = (uSq / 1024) * (256 + uSq * (-128 + uSq * (74 - 47 * uSq)));
  const deltaSigma =
    B *
    sinSigma *
    (cos2sigmaM +
      (B / 4) *
        (cosSigma * (-1 + 2 * cos2sigmaM * cos2sigmaM) -
          (B / 6) *
            cos2sigmaM *
            (-3 + 4 * sinSigma * sinSigma) *
            (-3 + 4 * cos2sigmaM * cos2sigmaM)));

  const distance = b * A * (sigma - deltaSigma); // distance = length of the geodesic

  return distance;
}

I’ll be honest, Vincenty’s formula is a bit lost on me, but it promises to provide more accurate results! In fact, its use is very common in geodesy because the expected accuracy is to 0.5mm on the Earth ellipsoid. Pretty cool right? The one downside is that it is computationally slower due to the increased number of operations and the iterative approach.

Hopefully, you’ve enjoyed this short blog post about how to measure distances on the Earth with JavaScript (…okay TypeScript). Let me know if you would be interested in follow-ups on similar topics!