-The typical point in a high-dimensional distribution is _not_ typical along each dimension individually. [In 100 flips of a biased coin](http://zackmdavis.net/blog/2019/05/the-typical-set/) that lands Heads 0.6 of the time, the _single_ most likely sequence is 100 Heads, but there's only one of those and you're _vanishingly_ unlikely to actually see it. The [sequences you'll actually observe will have close to 60 Heads](https://en.wikipedia.org/wiki/Asymptotic_equipartition_property). Each such sequence is individually less probable than the all-Heads sequence, but there are vastly more of them. Similarly, [most of the probability-mass of a high-dimensional multivariate normal distribution is concentrated in a thin "shell" some distance away from the mode](https://www.johndcook.com/blog/2011/09/01/multivariate-normal-shell/), for the same reason. (The _same_ reason: the binomial distribution converges to the normal in the limit of large _n_.)
+<a id="typical-point"></a>The typical point in a high-dimensional distribution is _not_ typical along each dimension individually. [In 100 flips of a biased coin](http://zackmdavis.net/blog/2019/05/the-typical-set/) that lands Heads 0.6 of the time, the _single_ most likely sequence is 100 Heads, but there's only one of those and you're _vanishingly_ unlikely to actually see it. The [sequences you'll actually observe will have close to 60 Heads](https://en.wikipedia.org/wiki/Asymptotic_equipartition_property). Each such sequence is individually less probable than the all-Heads sequence, but there are vastly more of them. Similarly, [most of the probability-mass of a high-dimensional multivariate normal distribution is concentrated in a thin "shell" some distance away from the mode](https://www.johndcook.com/blog/2011/09/01/multivariate-normal-shell/), for the same reason. (The _same_ reason: the binomial distribution converges to the normal in the limit of large _n_.)