Manifold diffusion geometry

The manifold diffusion geometry preprint is available on arXiv. The code is on GitHub.

The methods of diffusion geometry apply to general probability spaces. Manifolds are a special case with extra structure (a constant integer dimension and tangent spaces) which we can exploit to compute new quantities, like the dimension, tangent spaces, and scalar curvature.

While there are many existing approaches to differential geometry on data, it has historically been hard to implement these methods in a way that performs well statistically. Diffusion geometry lets us develop Riemannian geometry methods that are accurate and, crucially, also extremely robust to noise and low-density data.

We first compute the dimension of spaces pointwise, by diagonalising the metric tensor.
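
As a rough illustration of the idea (not the preprint's estimator), the sketch below builds a simple kernel estimate of the metric on the ambient coordinate functions at every point, then counts the eigenvalues that are not negligible. The Gaussian kernel, the `bandwidth` value, the relative threshold `rel_threshold`, and both function names are assumptions made for this example only.

```python
import numpy as np

def pointwise_metric(X, bandwidth):
    """Kernel estimate of g_ij(p) = Gamma(x_i, x_j)(p) for the coordinate functions.

    Assumption: a Gaussian-kernel Markov matrix stands in for the diffusion operator.
    """
    diff = X[None, :, :] - X[:, None, :]           # diff[p, q] = X[q] - X[p]
    sq_dist = (diff ** 2).sum(axis=-1)
    K = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    P = K / K.sum(axis=1, keepdims=True)           # row-stochastic Markov matrix
    # Carre du champ of the coordinates: half the P-weighted second moment about p.
    return 0.5 * np.einsum("pq,pqi,pqj->pij", P, diff, diff)

def pointwise_dimension(G, rel_threshold=0.1):
    """Count eigenvalues above a fraction of the largest one (hypothetical rule)."""
    eigvals = np.linalg.eigvalsh(G)                # shape (n, d), ascending order
    largest = eigvals[:, -1:]
    return (eigvals > rel_threshold * largest).sum(axis=1)

# A unit circle embedded in R^3, sampled at evenly spaced angles.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
dims = pointwise_dimension(pointwise_metric(circle, bandwidth=0.2))
```

On this noise-free circle the tangent direction dominates the spectrum at every point, so each entry of `dims` comes out as 1. A fixed threshold is the crudest possible rule and is used here only to keep the sketch short.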

We can take the median of the pointwise estimates to get a global dimension estimate. This is comparable to existing methods on perfect data but is significantly better when we add noise (increasing sigma) or change the sample density (varying n).
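
To show why the median is a sensible aggregate, here is a minimal sketch on a made-up array of pointwise estimates. The values in `pointwise_dims` are purely hypothetical and are not results from the preprint.

```python
import numpy as np

# Hypothetical pointwise dimension estimates for a surface, with a few outliers.
pointwise_dims = np.array([2, 2, 2, 3, 2, 2, 1, 2, 2, 5, 2])

global_dim = int(np.median(pointwise_dims))   # unaffected by the few outliers
mean_dim = pointwise_dims.mean()              # pulled upwards by the outliers
```

The median stays at the majority value, while the mean is shifted by the handful of bad points.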

The eigenvectors of the metric tensor give an orthonormal basis for the tangent space.
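
A minimal sketch of this step, assuming we already have a symmetric positive semi-definite metric matrix `G` at one point and a dimension estimate `dim`. The function name `tangent_basis` and the example matrix are hypothetical.

```python
import numpy as np

def tangent_basis(G, dim):
    """Return the `dim` leading eigenvectors of a symmetric matrix as columns."""
    eigvals, eigvecs = np.linalg.eigh(G)      # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :dim]          # reorder to descending, keep the top `dim`

# Hypothetical metric at one point of a surface in R^3 whose tangent plane
# is spanned by u1 and u2; the normal direction has eigenvalue zero.
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
u2 = np.array([0.0, 0.0, 1.0])
G = 1.0 * np.outer(u1, u1) + 0.5 * np.outer(u2, u2)

basis = tangent_basis(G, dim=2)
projector = basis @ basis.T                   # orthogonal projection onto the tangent plane
```

Because `np.linalg.eigh` returns orthonormal eigenvectors for a symmetric input, the columns of `basis` are orthonormal without any further processing.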

Tangent spaces are usually estimated with local PCA. The diffusion geometry tangents are comparable on perfect data, but perform significantly better when we add noise or reduce the density, and do not require parameter selection.
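
For reference, the local PCA baseline can be sketched as follows. This is a generic textbook version, not the specific implementation used in the comparison, and the neighbourhood size `k` is exactly the kind of parameter that has to be chosen by hand.

```python
import numpy as np

def local_pca_tangent(X, index, k, dim):
    """Estimate the tangent space at X[index] from its k nearest neighbours."""
    sq_dist = ((X - X[index]) ** 2).sum(axis=1)
    neighbours = X[np.argsort(sq_dist)[:k]]          # includes the point itself
    centred = neighbours - neighbours.mean(axis=0)
    # Right singular vectors are the principal directions of the neighbourhood.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:dim].T                                # columns span the estimate

# Unit circle in R^3; the true tangent at (1, 0, 0) is the y-axis.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
tangent = local_pca_tangent(circle, index=0, k=11, dim=1)
```

The sign of each returned direction is arbitrary, so comparisons with a ground-truth tangent should use the absolute value of the inner product or compare projection matrices.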

We then compute the curvature using the second fundamental form.
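
The link between the second fundamental form and scalar curvature is the Gauss equation: for a submanifold of Euclidean space, the scalar curvature equals the squared norm of the trace of the second fundamental form minus the squared norm of the form itself. The sketch below only evaluates that identity on a given array. It assumes the components `II[a, i, j]` are expressed in orthonormal tangent and normal bases, and it says nothing about how the form is estimated from data. The function name is hypothetical.

```python
import numpy as np

def scalar_curvature_from_second_fundamental_form(II):
    """Gauss equation in flat ambient space.

    II[a, i, j] is the component of II(e_i, e_j) along the unit normal n_a,
    with e_i an orthonormal tangent basis and n_a an orthonormal normal basis.
    """
    II = np.asarray(II, dtype=float)
    trace_per_normal = np.trace(II, axis1=1, axis2=2)   # trace over tangent indices
    return float((trace_per_normal ** 2).sum() - (II ** 2).sum())

# Sphere of radius r in R^3: both principal curvatures equal 1 / r.
radius = 2.0
sphere_II = np.eye(2)[None, :, :] / radius
sphere_curvature = scalar_curvature_from_second_fundamental_form(sphere_II)

# Unit cylinder in R^3: principal curvatures 1 and 0, so it is intrinsically flat.
cylinder_II = np.diag([1.0, 0.0])[None, :, :]
cylinder_curvature = scalar_curvature_from_second_fundamental_form(cylinder_II)
```

For the sphere this gives 2 / r^2, the familiar scalar curvature of a round 2-sphere, and for the cylinder it gives zero.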

This estimate remains highly accurate even as the noise increases and density drops.

The diffusion geometry scalar curvature is comparable to the existing state of the art on perfect data, but performs significantly better when we add noise or reduce the density, and does not require parameter selection.