A recent paper proves something that runners have long suspected: on average, GPS overestimates the distance you have traveled. This isn’t due to any algorithmic error; it is instead an unavoidable consequence of two facts:
- The position measurements that GPS makes are noisy — there is some degree of random error to them.
- The distance between two points is a convex function of the coordinates of the points.
A convex function is one that curves upwards. Familiar examples include f(x) = x^2 and f(x) = e^x.
For a function of one argument (such as the above examples), convexity means that the function has a nonnegative second derivative wherever that derivative exists. A convex function f of several arguments x_1, \ldots, x_n never curves downward, no matter what direction you follow; that is, the directional second derivative is nonnegative no matter what direction you choose.
Jensen’s Inequality states that
- if f is a convex function
- and x is a (possibly vector-valued) random variable
then
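E[f(x)] > f(E[x]).
(Strictly speaking, you could have = instead of >, but only if f is linear, with no curvature, over the whole region where x can fall; the simplest such case is a probability distribution for x that is concentrated at a single point.)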
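As a quick numerical illustration of the inequality, the sketch below uses f(x) = x^2 and a fair coin that lands on -1 or +1. Both choices are illustrative assumptions, not taken from the paper; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def expectation(values, probs):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probs))


def f(x):
    """A simple convex function (illustrative choice): f(x) = x^2."""
    return x * x


# Hypothetical distribution: x is -1 or +1, each with probability 1/2.
values = [Fraction(-1), Fraction(1)]
probs = [Fraction(1, 2), Fraction(1, 2)]

mean_of_f = expectation([f(v) for v in values], probs)  # E[f(x)]
f_of_mean = f(expectation(values, probs))               # f(E[x])

print(mean_of_f, f_of_mean)  # prints: 1 0
```

Here E[x] = 0, so f(E[x]) = 0, while f(x) equals 1 for both outcomes, so E[f(x)] = 1.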
In this case, x is the vector (x_1,y_1,x_2,y_2), where (x_1,y_1) are the measured GPS coordinates for the starting point and (x_2,y_2) are the measured GPS coordinates for the ending point, and
f(x) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}
is the calculated distance between the two points. It is straightforward to show that this distance function f(x) is convex.
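One way to see the convexity is sketched below. It uses the standard chord definition of convexity, f(t a + (1-t) b) \le t f(a) + (1-t) f(b) for 0 \le t \le 1, rather than second derivatives. The symbols d, a, b, and t are introduced here for illustration: d(x) = (x_1 - x_2, y_1 - y_2) is the displacement between the two points, which is linear in x, and a and b are any two coordinate vectors. Since f(x) = \| d(x) \|, the Euclidean length of the displacement:

```latex
\begin{aligned}
f\bigl(t a + (1-t) b\bigr)
  &= \bigl\| d\bigl(t a + (1-t) b\bigr) \bigr\| \\
  &= \bigl\| t\, d(a) + (1-t)\, d(b) \bigr\|
     && \text{($d$ is linear)} \\
  &\le t\, \| d(a) \| + (1-t)\, \| d(b) \|
     && \text{(triangle inequality)} \\
  &= t\, f(a) + (1-t)\, f(b).
\end{aligned}
```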
Note that (x_1,y_1) and (x_2,y_2) are noisy measurements, not the actual (imperfectly known) coordinates. If we assume that the GPS measurements, although noisy, are at least unbiased, then
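E\left[\left(x_i,y_i\right)\right] = \left(x^*_i, y^*_i\right)
where (x^*_1,y^*_1) and (x^*_2,y^*_2) are the actual coordinates. The calculated distance is f(x) and the actual distance is f(x^*) = f(E[x]). As long as the noise can change the direction of the measured displacement and not just its length, which is true of any realistic GPS error, Jensen’s inequality guarantees that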
E[f(x)] > f(x^*).
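The effect is easy to reproduce in simulation. The following is a minimal sketch, not the paper's method: it assumes independent, zero-mean Gaussian noise with the same standard deviation on every coordinate, and the true positions, noise level, and function names are all hypothetical values chosen for illustration.

```python
import math
import random


def measured_distance(p1, p2, sigma, rng):
    """Distance between two points after adding independent
    zero-mean Gaussian noise (std. dev. sigma) to each coordinate."""
    x1 = p1[0] + rng.gauss(0.0, sigma)
    y1 = p1[1] + rng.gauss(0.0, sigma)
    x2 = p2[0] + rng.gauss(0.0, sigma)
    y2 = p2[1] + rng.gauss(0.0, sigma)
    return math.hypot(x1 - x2, y1 - y2)


def mean_measured_distance(p1, p2, sigma, trials, seed=0):
    """Monte Carlo estimate of E[f(x)], the average calculated distance."""
    rng = random.Random(seed)
    total = sum(measured_distance(p1, p2, sigma, rng) for _ in range(trials))
    return total / trials


# Hypothetical true positions (metres) and noise level (metres).
start, end = (0.0, 0.0), (10.0, 0.0)
true_distance = math.hypot(end[0] - start[0], end[1] - start[1])  # f(x*)
estimate = mean_measured_distance(start, end, sigma=3.0, trials=100_000)

print(f"true distance:          {true_distance:.2f} m")
print(f"mean measured distance: {estimate:.2f} m")
```

Under these assumptions the mean measured distance comes out larger than the true 10 m, even though every individual coordinate measurement is unbiased.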