+Statistical sex differences are like flipping two different collections of coins with different biases, where the coins represent various traits. Learning the outcome of any individual flip doesn't tell you which set the coin came from, but [if we look at the aggregation of many flips, we can get _godlike_ confidence](https://www.lesswrong.com/posts/cu7YY7WdgJBs3DpmJ/the-univariate-fallacy-1) as to which collection we're looking at.
+
+A single-variable measurement like height is like a single coin: unless the coin is _very_ biased, one flip can't tell you much about the bias. But there are lots of things about people for which it's not that they can't be measured, but that the measurements require _more than one number_—which correspondingly offer more information about the distribution generating them.
+
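To make the coin analogy concrete, here is a minimal simulation sketch. Everything specific in it is an assumption made for illustration rather than anything from the linked post: every coin in hypothetical collection A lands heads 60% of the time, every coin in collection B lands heads 40% of the time, the flips are independent, and the two collections are equally likely beforehand. The helper names (`make_biases`, `log_likelihood`, `classify`, `accuracy`) are likewise made up. The guesser simply picks whichever collection gives the observed flips the higher likelihood.

```python
import math
import random


def make_biases(n_coins, base=0.5, shift=0.1):
    """Heads probabilities for two hypothetical collections of n_coins coins.

    Assumption for illustration: every coin in A is biased toward heads by
    `shift`, and every coin in B is biased toward tails by the same amount.
    """
    return [base + shift] * n_coins, [base - shift] * n_coins


def log_likelihood(flips, biases):
    """Log-probability of a sequence of independent flips given per-coin biases."""
    return sum(math.log(p if heads else 1.0 - p) for heads, p in zip(flips, biases))


def classify(flips, biases_a, biases_b):
    """Guess the collection with the higher likelihood (equal priors assumed)."""
    if log_likelihood(flips, biases_a) >= log_likelihood(flips, biases_b):
        return "A"
    return "B"


def accuracy(n_coins, trials=1000, seed=0):
    """Fraction of simulated trials in which the guess matches the true collection."""
    rng = random.Random(seed)
    biases_a, biases_b = make_biases(n_coins)
    correct = 0
    for _ in range(trials):
        truth = rng.choice("AB")
        biases = biases_a if truth == "A" else biases_b
        flips = [rng.random() < p for p in biases]  # one flip per coin
        correct += classify(flips, biases_a, biases_b) == truth
    return correct / trials


if __name__ == "__main__":
    # One flip is barely better than guessing; many flips are nearly decisive.
    for n in (1, 10, 100, 1000):
        print(n, accuracy(n))
```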
+[TODO (somewhere around-ish this section): chromosomes at the root of the causal graph: https://www.lesswrong.com/posts/hzuSDMx7pd2uxFc5w/causal-diagrams-and-causal-models ]
+
+Take faces. People are [verifiably very good at recognizing sex from (hair covered, males clean-shaven) photographs of people's faces](/papers/bruce_et_al-sex_discrimination_how_do_we_tell.pdf) (96% accuracy, which is the equivalent of _d_ ≈ 3.5), but we don't have direct introspective access into what _specific_ features our brains are using to do it; we just look, and _somehow_ know. The differences are real, but it's not a matter of any single, simple measurement you could perform with a ruler (like the distance between someone's eyes). Rather, it's a high-dimensional _pattern_ in many measurements you could take with a ruler, no one of which is definitive. [Covering up the nose makes people slower and slightly worse at sexing faces, but people don't do better than chance at guessing sex from photos of noses alone](/papers/roberts-bruce-feature_saliency_in_judging_the_sex_and_familiarity_of_faces.pdf).
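
One way to read the conversion from 96% accuracy to _d_ ≈ 3.5 is sketched below. It rests on assumptions the text above doesn't state: the two groups differ along a single summary score that is normally distributed with equal variance in each group, both groups are equally common, and the guesser puts the cutoff at the midpoint. Under those assumptions the best achievable accuracy is Φ(_d_/2), so _d_ = 2·Φ⁻¹(accuracy). The function names are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def accuracy_from_d(d):
    """Best-case accuracy for two equal-variance normals separated by d
    standard deviations, with equal base rates and a midpoint cutoff."""
    return STANDARD_NORMAL.cdf(d / 2)


def d_from_accuracy(acc):
    """Inverse of accuracy_from_d: the separation implied by a given accuracy."""
    return 2 * STANDARD_NORMAL.inv_cdf(acc)


if __name__ == "__main__":
    print(round(d_from_accuracy(0.96), 2))  # about 3.5
    print(round(accuracy_from_d(3.5), 3))   # about 0.96
```

For comparison, under the same assumptions a univariate difference of _d_ = 1 would only allow about 69% accuracy.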