Tags: Charles Murray, review (book), intelligence, race, sex differences, Emacs, politics, probability, topology, COVID-19
Status: draft
-[This is a pretty good book](https://www.twelvebooks.com/titles/charles-murray/human-diversity/9781538744000/) about things we know about some ways in which people are different from each other! In [my last book review](/2020/Jan/book-review-the-origins-of-unfairness/), I mentioned that I had been thinking about broadening the topic scope of this blog, and this book review seems like an okay place to start!
+[This is a pretty good book](https://www.twelvebooks.com/titles/charles-murray/human-diversity/9781538744000/) about things we know about some ways in which people are different from each other, including differences in _cognitive repertoires_ (author Charles Murray's choice of phrase for saving nine syllables contrasted to "personality, abilities, and social behavior"). In [my last book review](/2020/Jan/book-review-the-origins-of-unfairness/), I mentioned that I had been thinking about broadening the topic scope of this blog, and this book review seems like an okay place to start!
Honestly, I feel like I already knew most of this stuff?—sex differences in particular are kind of _my bag_—but many of the details were new to me, and it's nice to have it all bundled together in a paper book with lots of citations that I can chase down later when I'm skeptical or want more details about a specific thing! The main text is littered with pleonastic constructions like "The first author was Jane Thisand-Such" (when discussing the results of a multi-author paper) or "Details are given in the note<sup>[n]</sup>", which feel clunky to read, but are _so much better_ than the all-too-common alternative of authors _not_ "showing their work".
_Human Diversity_ is divided into three parts corresponding to the topics in the subtitle! (Plus another part if you want some wrapping-up commentary from Murray.) So the first part is about things we know about some ways in which female people and male people are different from each other!
-The first (short) chapter is mostly about explaining [Cohen's _d_](https://en.wikiversity.org/wiki/Cohen%27s_d) [effect sizes](https://en.wikipedia.org/wiki/Effect_size), which I think are solving a very important problem! When people say "Men are taller than women" you know they don't mean _all_ men are taller than _all_ women (because you know that they know that that's obviously not true), but that just raises the question of what they _do_ mean. Saying they mean it "generally", "on average", or "statistically" doesn't really solve the problem, because that covers everything between-but-not-including "No difference" to "Yes, literally all women and all men". Cohen's _d_ is the summary statistic that lets us _quantify_ statistical differences in standardized form: once you can [visualize the overlapping distributions](https://rpsychologist.com/d3/cohend/), whether the reality of the data should be summarized in English words as a "large difference" or a "small difference" becomes a _much less interesting_ question.
+The first (short) chapter is mostly about explaining [Cohen's _d_](https://en.wikiversity.org/wiki/Cohen%27s_d) [effect sizes](https://en.wikipedia.org/wiki/Effect_size), which I think are solving a very important problem! When people say "Men are taller than women" you know they don't mean _all_ men are taller than _all_ women (because you know that they know that that's obviously not true), but that just raises the question of what they _do_ mean. Saying they mean it "generally", "on average", or "statistically" doesn't really solve the problem, because that covers everything between-but-not-including "No difference" to "Yes, literally all women and all men". Cohen's _d_—the difference between two groups' means in terms of their pooled standard deviation—lets us give a _quantitative_ answer to _how much_ men are taller than women: I've seen reports of _d_ ≈ 1.4–1.7 depending on the source, a lot smaller than the sex difference in murder rates (_d_ ≈ 2.5), but much bigger than the difference in verbal skills (_d_ ≈ 0.3, favoring women).
-Murray also addresses the issue of aggregating effect sizes—something [I've been meaning to get around to blogging about](/2018/Dec/untitled-metablogging-26-december-2018/#high-dimensional-social-science-and-the-conjunction-of-small-effect-sizes) more exhaustively for a while in this context of group differences (although at least, um, my favorite author on _Less Wrong_ [covered it in the purely abstract setting](https://www.lesswrong.com/posts/cu7YY7WdgJBs3DpmJ/the-univariate-fallacy)): small effect sizes in any single measurement can amount to a _big_ difference when you're considering many measurements at once. That's how people can [distinguish female and male faces at 96% accuracy](http://unremediatedgender.space/papers/bruce_et_al-sex_discrimination_how_do_we_tell.pdf), even though there's no single measurement (like "eye width" or "nose height") offers that much predictive power.
+If you have a quantitative effect size, then you can [visualize the overlapping distributions](https://rpsychologist.com/d3/cohend/), and the question of whether the reality of the data should be summarized in English as a "large difference" or a "small difference" becomes _much less interesting_, bordering on meaningless.
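To make the definition concrete, here's a minimal sketch of computing Cohen's _d_ from two samples, plus how much two bell curves overlap at a given _d_. The function names and the height numbers are made up for illustration (they're not from the book), and the overlap formula assumes both groups are normally distributed with equal variance.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Difference between the group means, in units of the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def overlap_coefficient(d):
    """Shared area under two equal-variance normal curves whose means differ by d SDs."""
    return 2 * statistics.NormalDist().cdf(-abs(d) / 2)


# Made-up heights in centimeters, for illustration only.
men = [170, 175, 180, 185, 190]
women = [158, 163, 168, 173, 178]

d = cohens_d(men, women)
print(f"d = {d:.2f}, overlap = {overlap_coefficient(d):.2f}")
```

The point of printing the overlap alongside _d_ is that the pair of numbers says everything the words "large" or "small" were trying to say, without the argument about which word applies.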
-[TODO: more examples of sex difference effect sizes, elaborate on "big" doesn't mean anything]
+Murray also addresses the issue of aggregating effect sizes—something [I've been meaning to get around to blogging about](/2018/Dec/untitled-metablogging-26-december-2018/#high-dimensional-social-science-and-the-conjunction-of-small-effect-sizes) more exhaustively for a while in this context of group differences (although at least, um, my favorite author on _Less Wrong_ [covered it in the purely abstract setting](https://www.lesswrong.com/posts/cu7YY7WdgJBs3DpmJ/the-univariate-fallacy)): small effect sizes in any single measurement can amount to a _big_ difference when you're considering many measurements at once. That's how people can [distinguish female and male faces at 96% accuracy](http://unremediatedgender.space/papers/bruce_et_al-sex_discrimination_how_do_we_tell.pdf), even though no single measurement (like "eye width" or "nose height") offers that much predictive power.
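Here's a toy sketch of how the aggregation works, under some loudly simplifying assumptions that are mine and not Murray's: every measurement is normally distributed with unit variance within each group, the measurements are statistically _independent_ of each other, and the two groups are the same size. (Real measurements are correlated, which makes the combined separation smaller than this idealized calculation suggests.) The function names are hypothetical.

```python
import math
from statistics import NormalDist


def combined_separation(effect_sizes):
    """Multivariate distance between group means, assuming independent unit-variance measurements."""
    return math.sqrt(sum(d * d for d in effect_sizes))


def best_accuracy(separation):
    """Accuracy of the optimal classifier for two equal-sized, equal-variance normal groups."""
    return NormalDist().cdf(separation / 2)


# One small effect versus a hundred small effects considered at once.
one = combined_separation([0.3])
many = combined_separation([0.3] * 100)
print(f"1 measurement:    accuracy {best_accuracy(one):.2f}")
print(f"100 measurements: accuracy {best_accuracy(many):.2f}")
```

Under these assumptions the separation grows like the square root of the number of measurements, which is why a pile of individually unimpressive differences can add up to a nearly unmistakable one.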
Subsequent chapters address sex differences in personality, cognition, interests, and the brain. It turns out that women are more warm, empathetic, æsthetically discerning, and cooperative than men are! They're also more into the Conventional, Artistic, and Social dimensions of the [Holland occupational-interests model](https://en.wikipedia.org/wiki/Holland_Codes).
-You might think that this is all due to socialization, but then it's hard to explain why the same differences show up in different cultures—and why (counterintuitively) the differences seem _larger_ in richer, more feminist countries. (Although as evolutionary anthropologist [William Buckner](https://traditionsofconflict.com/) points out in [his](https://twitter.com/Evolving_Moloch/status/1228124441944584192) [social-media](https://twitter.com/Evolving_Moloch/status/1228860328483491840) [criticism](https://twitter.com/Evolving_Moloch/status/1228947493309698050) of _Human Diversity_, [W.E.I.R.D.](https://www.apa.org/monitor/2010/05/weird) samples from different countries aren't capturing the full range of human cultures.) You might think that the "larger differences in rich countries" result is an artifact: maybe people in less-feminist countries implicitly make within-sex comparisons when answering personality questions (_e.g._, "I'm competitive _for a woman_") whereas people in more-feminist countries use a less sexist standard of comparison, construing ratings as compared to people-in-general. Murray points out that this explanation still posits the existence of large sex differences in rich countries (while explaining away the unexpected cross-cultural difference-in-differences). Another possibility is that wealth increases sexual dimorphism _in general_, including, _e.g._, height and blood pressure, not just in personality.
-
-[TODO: tie into farmer/forager theory: http://www.overcomingbias.com/2010/10/divide-forager-v-farmer.html ]
+You might think that this is all due to socialization, but then it's hard to explain why the same differences show up in different cultures—and why (counterintuitively) the differences seem _larger_ in richer, more feminist countries. (Although as evolutionary anthropologist [William Buckner](https://traditionsofconflict.com/) points out in [his](https://twitter.com/Evolving_Moloch/status/1228124441944584192) [social-media](https://twitter.com/Evolving_Moloch/status/1228860328483491840) [criticism](https://twitter.com/Evolving_Moloch/status/1228947493309698050) of _Human Diversity_, [W.E.I.R.D.](https://www.apa.org/monitor/2010/05/weird) samples from different countries aren't capturing the full range of human cultures.) You might think that the "larger differences in rich countries" result is an artifact: maybe people in less-feminist countries implicitly make within-sex comparisons when answering personality questions (_e.g._, "I'm competitive _for a woman_") whereas people in more-feminist countries use a less sexist standard of comparison, construing ratings as compared to people-in-general. Murray points out that this explanation still posits the existence of large sex differences in rich countries (while explaining away the unexpected cross-cultural difference-in-differences). Another possibility is that sexual dimorphism _in general_ increases with wealth, including, _e.g._, in height and blood pressure, not just in personality. (I notice that this is consilient with the view that [agriculture was a mistake](https://www.discovermagazine.com/planet-earth/the-worst-mistake-in-the-history-of-the-human-race) that suppresses humans' natural tendencies, and that people [revert to forager-like lifestyles](http://www.overcomingbias.com/2010/10/divide-forager-v-farmer.html) [in many ways](http://www.overcomingbias.com/2017/08/forager-v-farmer-elaborated.html) as the riches of the industrial revolution let them afford it.)
Women are better at verbal ability and social cognition, whereas [men are better at visuospatial skills](http://zackmdavis.net/blog/2016/12/alpha-gamma-phi/). The sexes achieve similar levels of overall performance via somewhat different mental "toolkits." Murray devotes a section to a 2007 result of Johnson and Bouchard, who report that general intelligence ["masks the dimensions on which [sex differences in mental abilities] lie"](/papers/johnson-bouchard-sex_differences_in_mental_abilities_g_masks_the_dimensions.pdf): overall levels of mental well-functioning lead to underestimates of the effect sizes of specific mental abilities, which you want to statistically correct for. This result in particular is _super gratifying_ to me personally, because [I independently had a very similar idea a few months back](/2019/Sep/does-general-intelligence-deflate-standardized-effect-sizes-of-cognitive-sex-differences/)—it's _super validating_ as an amateur to find that the pros have been thinking along the same track!
The curmudgeonly view epitomized by Turkheimer says that Science is about understanding the _causal structure_ of phenomena, and that polygenic scores don't fucking tell us anything. [Marital status is heritable _in the same way_ that intelligence is heritable](http://www.geneticshumanagency.org/gha/the-ubiquity-problem-for-group-differences-in-behavior/), not because there are "divorce genes" in any meaningful biological sense, but because of a "universal, nonspecific genetic pull on everything": on average, people with more similar genes will make more similar proteins from those similar genes, and therefore end up with more similar phenotypes that interact with the environment in a more similar way, and _eventually_ (the causality flowing "upwards" through many hierarchical levels of organization) this shows up in the divorce statistics of a particular Society in a particular place and time. But this is opaque and banal; the real work of Science is in figuring out what all the particular gene variations actually _do_.
-Notably, Plomin and Turkheimer aren't actually disagreeing here: it's a difference in emphasis rather than facts. Polygenic scores _don't_ explain mechanisms—but might they end up being useful, and used, anyway? Murray's vision of social science is content to make predictions and "explain variance" while remaining ignorant of ultimate causality. Murray compares polygenic scores to "economic indexes predicting GDP growth", which is not necessarily a reassuring analogy to those who doubt how much of GDP represents real production rather than the "exhaust heat" of zero-sum contests in an environment of [manufactured scarcity](http://benjaminrosshoffman.com/there-is-a-war/) and [artificial demand](https://write.as/harold-lee/the-sliding-scale-of-bullshit-jobs).
-
-Meanwhile, my cursory understanding (while kicking myself for [_still_](/2018/Dec/untitled-metablogging-26-december-2018/#daphne-koller-and-the-methods) not having put in the hours to get much farther into [_Probabilistic Graphical Models: Principles and Techniques_](https://mitpress.mit.edu/books/probabilistic-graphical-models)) was that you _need_ to understand causality in order to predict what interventions will have what effects: variance in rain may be statistically "explained by" variance in mud puddles, but you can't make it rain by turning the hose on. Maybe our feeble state of knowledge is _why_ we don't know how to find reliable large-effect environmental interventions that still yet might exist in the vastness of the space of possible interventions.
+Notably, Plomin and Turkheimer aren't actually disagreeing here: it's a difference in emphasis rather than facts. Polygenic scores _don't_ explain mechanisms—but might they end up being useful, and used, anyway? Murray's vision of social science is content to make predictions and "explain variance" while remaining ignorant of ultimate causality. (Murray compares polygenic scores to "economic indexes predicting GDP growth", which is not necessarily a reassuring analogy to those who doubt how much of GDP represents real production rather than the "exhaust heat" of zero-sum contests in an environment of [manufactured scarcity](http://benjaminrosshoffman.com/there-is-a-war/) and [artificial demand](https://write.as/harold-lee/the-sliding-scale-of-bullshit-jobs).) Meanwhile, my cursory understanding (while kicking myself for [_still_](/2018/Dec/untitled-metablogging-26-december-2018/#daphne-koller-and-the-methods) not having put in the hours to get much farther into [_Probabilistic Graphical Models: Principles and Techniques_](https://mitpress.mit.edu/books/probabilistic-graphical-models)) was that you _need_ to understand causality in order to predict what interventions will have what effects: variance in rain may be statistically "explained by" variance in mud puddles, but you can't make it rain by turning the hose on. Maybe our feeble state of knowledge is _why_ we don't know how to find reliable large-effect environmental interventions that still yet might exist in the vastness of the space of possible interventions.
There are also some appendices at the back of the book! Appendix 1 (reproduced from, um, one of Murray's earlier books with a coauthor) explains some basic statistics concepts. Appendix 2 ("Sexual Dimorphism in Humans") goes over the prevalence of intersex conditions and gays, and then—so much for this post broadening the [topic scope of this blog](/tag/two-type-taxonomy/)—transgender typology! Murray presents the Blanchard–Bailey–Lawrence–Littman view as fact, which I think is basically _correct_, but a more comprehensive treatment (which I concede may be too much to hope for from a mere Appendix) would have at least _mentioned_ alternative views ([Serano](https://rationalwiki.org/wiki/Intrinsic_Inclinations_Model)? [Veale](/papers/veale-lomax-clarke-identity_defense_model.pdf)?), if only to explain _why_ they're worth dismissing. (Contrast to the eight pages in the main text explaining why "But, but, epigenetics!" is worth dismissing.) Then Appendix 3 ("Sex Differences in Brain Volumes and Variance") has tables of brain-size data, and an explanation of the greater-male-variance hypothesis. Cool!
In 1994's _The Bell Curve: Intelligence and Class Structure in American Life_, Murray and coauthor Richard J. Herrnstein argued that a lot of variation in life outcomes is explained by variation in intelligence. Some people think that folk concepts of "intelligence" or being "smart" are ill-defined and therefore not a proper object of scientific study. But that hasn't stopped some psychologists from trying to construct tests purporting to measure an "intelligence quotient" (or _IQ_ for short). It turns out that if you give people a bunch of different mental tests, the results all positively correlate with each other: people who are good at one mental task, like listening to a list of numbers and repeating them backwards ("reverse digit span"), are also good at others, like knowing what words mean ("vocabulary"). There's a lot of fancy linear algebra involved, but basically, you can visualize people's test results as a hyper[ellipsoid](https://en.wikipedia.org/wiki/Ellipsoid) in some high-dimensional space where the dimensions are the different tests. (I rely on this ["configuration space"](https://www.lesswrong.com/posts/WBw8dDkAWohFjWQSk/the-cluster-structure-of-thingspace) visual metaphor _so much_ for _so many_ things that when I started [my secret ("secret") gender blog](/), it felt right to put it under a `.space` [TLD](https://en.wikipedia.org/wiki/Top-level_domain).) The longest axis of the hyperellipsoid corresponds to the "_g_ factor" of "general" intelligence—the choice of axis that cuts through the most variance in mental abilities.
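As a rough sketch of the "longest axis" idea: if you have the matrix of correlations between tests, the direction that cuts through the most variance is its leading eigenvector, which you can find by power iteration. Caveats: the correlation numbers below are made up, the function name is hypothetical, and actual psychometricians typically extract _g_ with factor analysis rather than this plain principal-axis calculation, so treat it as an illustration of the geometry and not as the real methodology.

```python
import math


def first_principal_axis(matrix, iterations=200):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n  # start from the "all tests equally" direction
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, v


# Hypothetical correlations among three mental tests (all positive).
correlations = [
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
]

eigenvalue, loadings = first_principal_axis(correlations)
share = eigenvalue / len(correlations)  # fraction of total variance along the longest axis
print(f"loadings = {[round(x, 2) for x in loadings]}, variance share = {share:.2f}")
```

When every test correlates positively with every other, every loading on this axis comes out positive, which is the sense in which the axis is "general".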
-It's important not to overinterpret the _g_ factor as some unitary essence of intelligence rather than the length of a hyperellipsoid. It seems likely that [if you gave people a bunch of _physical_ tests, they would positively correlate with each other](https://www.talyarkoni.org/blog/2010/03/07/what-the-general-factor-of-intelligence-is-and-isnt-or-why-intuitive-unitarianism-is-a-lousy-guide-to-the-neurobiology-of-higher-cognitive-ability/), such that you could extract a ["general factor of athleticism"](https://isteve.blogspot.com/2007/09/g-factor-of-sports.html). (It would be really interesting if anyone's actually done this using the same methodology used to construct IQ tests!) But _athleticism_ is going to be an _very_ "coarse" construct for which [the tails come apart](https://www.lesswrong.com/posts/dC7mP5nSwvpL65Qu5/why-the-tails-come-apart): for example, world champion 100-meter sprinter Usain Bolt's best time in the _800_ meters is [reportedly only around 2:10](https://www.newyorker.com/sports/sporting-scene/how-fast-would-usain-bolt-run-the-mile) [or 2:07](https://archive.is/T988h)! (For comparison, _I_ ran a 2:08.3 in high school once.)
+It's important not to overinterpret the _g_ factor as some unitary essence of intelligence rather than the length of a hyperellipsoid. It seems likely that [if you gave people a bunch of _physical_ tests, they would positively correlate with each other](https://www.talyarkoni.org/blog/2010/03/07/what-the-general-factor-of-intelligence-is-and-isnt-or-why-intuitive-unitarianism-is-a-lousy-guide-to-the-neurobiology-of-higher-cognitive-ability/), such that you could extract a ["general factor of athleticism"](https://isteve.blogspot.com/2007/09/g-factor-of-sports.html). (It would be really interesting if anyone's actually done this using the same methodology used to construct IQ tests!) But _athleticism_ is going to be a _very_ "coarse" construct for which [the tails come apart](https://www.lesswrong.com/posts/dC7mP5nSwvpL65Qu5/why-the-tails-come-apart): for example, world champion 100-meter sprinter Usain Bolt's best time in the _800_ meters is [reportedly only around 2:10](https://www.newyorker.com/sports/sporting-scene/how-fast-would-usain-bolt-run-the-mile) [or 2:07](https://archive.is/T988h)! (For comparison, _I_ ran a 2:08.3 in high school once!)
-Anyway, so Murray and Herrnstein talk about this "intelligence" construct, and how it's heritable, and how it predicts income, school success, not being a criminal, _&c._, and how this has all sorts of implications for Society and inequality and class structure and stuff. [TODO: mention "Coming Apart" thesis?]
+Anyway, so Murray and Herrnstein talk about this "intelligence" construct, and how it's heritable, and how it predicts income, school success, not being a criminal, _&c._, and how Society is becoming increasingly stratified by cognitive abilities, as school credentials become the ticket to the upper and upper-middle classes.
This _should_ just be more social-science nerd stuff, the sort of thing that would only draw your attention if, like me, you feel bad about not being smart enough to do algebraic topology and want to console yourself by at least knowing about the Science of not being smart enough to do algebraic topology. The reason everyone _and her dog_ is still mad at Charles Murray a quarter of a century later is Chapter 13, "Ethnic Differences in Cognitive Ability", and Chapter 14, "Ethnic Inequalities in Relation to IQ". So, _apparently_, different ethnic/"racial" groups have different average scores on IQ tests. [Ashkenazi Jews do the best](https://slatestarcodex.com/2017/05/26/the-atomic-bomb-considered-as-hungarian-high-school-science-fair-project/), which is why I sometimes privately joke that the fact that I'm [only 85% Ashkenazi (according to 23andMe)](/images/ancestry_report.png) explains my low IQ. ([I got a 131](/images/wisc-iii_result.jpg) on the [WISC-III](https://en.wikipedia.org/wiki/Wechsler_Intelligence_Scale_for_Children) at age 10, but that's pretty dumb compared to some of my [robot-cult](/tag/my-robot-cult/) friends.) East Asians do a little better than Europeans/"whites". And—this is the part that no one is happy about—the difference between U.S. whites and U.S. blacks is about Cohen's _d_ ≈ 1. (If two groups differ by _d_ = 1 on some measurement that's normally distributed within each group, that means that the mean of the group with the lower average measurement is at the 16th percentile of the group with the higher average measurement, or that a uniformly-randomly selected member of the group with the higher average measurement has a probability of about 0.76 of having a higher measurement than a uniformly-randomly selected member of the group with the lower average measurement.)
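If you want to check the parenthetical arithmetic about what _d_ = 1 means, here's a short sketch. It assumes both groups are normally distributed with the _same_ standard deviation (which is what makes a single _d_ meaningful), and the function names are mine.

```python
import math
from statistics import NormalDist

standard_normal = NormalDist()


def percentile_of_lower_mean(d):
    """Where the lower group's mean falls within the higher group's distribution."""
    return standard_normal.cdf(-d)


def probability_of_superiority(d):
    """P(random member of higher group > random member of lower group).

    The difference of two independent unit-variance normals has standard
    deviation sqrt(2), hence the rescaling.
    """
    return standard_normal.cdf(d / math.sqrt(2))


print(f"percentile:  {percentile_of_lower_mean(1.0):.2f}")
print(f"superiority: {probability_of_superiority(1.0):.2f}")
```

Both functions take any _d_, so you can plug in the other effect sizes mentioned in this post and see how quickly the numbers move.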
+Given the tendency for people to distort shared maps for political reasons, you can see why this is a hotly contentious line of research. Even if you take the test numbers at face value, racists trying to secure unjust privileges for groups that score well have an incentive to "play up" group IQ differences in bad faith even when they shouldn't be [relevant](https://www.lesswrong.com/posts/GSz8SrKFfW7fJK2wN/relevance-norms-or-gricean-implicature-queers-the-decoupling). As economist Glenn C. Loury points out in _The Anatomy of Racial Inequality_, cognitive abilities decline with _age_, and yet we don't see a moral panic about the consequences of an aging workforce, because older people are construed as an "us"—our mothers and fathers—rather than an outgroup. _Individual_ differences in intelligence are also presumably less politically threatening because "smart people" as a group aren't construed as a natural political coalition—although Murray's work on cognitive class stratification seems to suggest this intuition is mistaken.
+
It's important not to overinterpret the IQ-scores-by-race results; there are a bunch of standard caveats that go here that everyone's treatment of the topic needs to include. Again, just because variance in a trait is statistically associated with variance in genes _within_ a population, does _not_ mean that differences in that trait _between_ populations are _caused_ by genes: [remember the illustrations about](#heritability-caveats) sun-deprived plants and internet-deprived red-haired children. Group differences in observed tested IQs are entirely compatible with a world in which those differences are entirely due to the environment imposed by an overtly or structurally racist society. Maybe the tests are culturally biased. Maybe people with higher socioeconomic status get more opportunities to develop their intellect, and racism impedes socio-economic mobility. And so on.
The problem is, a lot of the blank-slatey environmentally-caused-differences-only hypotheses for group IQ differences start to look less compelling when you look into the details. "Maybe the tests are biased", for example, isn't an insurmountable defeater to the entire endeavor of IQ testing—it is _itself_ a falsifiable hypothesis, or can become one if you specify what you mean by "bias" in detail. One idea of what it would mean for a test to be _biased_ is if it's partially measuring something other than what it purports to be measuring: if your test measures a _combination_ of "intelligence" and "submission to the hegemonic cultural dictates of the test-maker", then individuals and groups that submit less to your cultural hegemony are going to score worse, and if you _market_ your test as unbiasedly measuring intelligence, then people who believe your marketing copy will be misled into thinking that those who don't submit are dumber than they really are. But if so, and if not all of your individual test questions are _equally_ loaded on intelligence and cultural-hegemony, then the cultural bias should _show up in the statistics_. If some questions are more "fair" and others are relatively more culture-biased, then you would expect the _order of item difficulties_ to differ by culture: the ["item characteristic curve"](/papers/baker-kim-the_item_characteristic_curve.pdf) plotting the probability of getting a biased question "right" as a function of _overall_ test score should differ by culture, with the hegemonic group finding it "easier" and others finding it "harder". Conversely, if the questions that discriminate most between differently-scoring cultural/ethnic/"racial" groups were the same as the questions that discriminate between (say) younger and older children _within_ each group, that would be the kind of statistical clue you would expect to see if the test was unbiased and the group difference was real.
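To illustrate what "the item characteristic curve should differ by culture" would look like, here's a sketch using the standard two-parameter logistic form of the curve. Everything numeric is invented for illustration, the function names are hypothetical, and real bias analyses involve _fitting_ these parameters to response data (which this sketch skips entirely); it also uses latent ability where the paragraph above talks about overall test score.

```python
import math


def icc(ability, discrimination, difficulty):
    """Two-parameter logistic item characteristic curve: P(correct | ability)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


def bias_gap(ability, discrimination, difficulty_group_1, difficulty_group_2):
    """Difference in P(correct) between two groups *at the same ability level*."""
    return icc(ability, discrimination, difficulty_group_1) - icc(
        ability, discrimination, difficulty_group_2
    )


# A "fair" item has the same difficulty for both groups, so the gap is zero
# at every ability level. A culturally loaded item (made-up numbers) is
# effectively easier for group 1 than for equally able members of group 2.
for ability in (-1.0, 0.0, 1.0):
    fair = bias_gap(ability, 1.2, 0.0, 0.0)
    loaded = bias_gap(ability, 1.2, -0.5, 0.5)
    print(f"ability {ability:+.1f}: fair gap {fair:.2f}, loaded gap {loaded:.2f}")
```

The key move is conditioning on ability: a group difference in the raw proportion getting an item right is expected if the groups differ on the underlying trait, whereas a nonzero gap _at matched ability_ is the statistical signature of an item measuring something else.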
I used to be a naïve egalitarian. I was very passionate about it. I was eighteen years old. I am—again—still fond of the moral sentiment, and eager to renormalize it into something that makes sense. (Some egalitarian anxieties do translate perfectly well into the Bayesian setting, as I'll explain in a moment.) But the abject horror I felt at eighteen at the mere suggestion of _making generalizations_ about _people_ just—doesn't make sense. Not that it _shouldn't_ be practiced (it's not that my heart wasn't in the right place), but that it _can't_ be practiced—that the people who think they're practicing it are just confused about how their own minds work.
-Give people photographs of various women and men and ask them to judge how tall the people in the photos are, as [Nelson _et al._ 1990 did](/papers/nelson_et_al-everyday_base_rates_sex_stereotypes_potent_and_resilient.pdf), and people's guesses reflect both the photo-subjects' actual heights, but also (to a lesser degree) their sex. Unless you expect people to be perfect at assessing height from photographs (when they don't know how far away the cameraperson was standing, aren't ["trigonometrically omniscient"](https://plato.stanford.edu/entries/logic-epistemic/#LogiOmni), _&c._), this behavior is just _correct_: men really are taller than women on average (I've seen _d_ ≈ 1.4–1.7 depending on the source), so P(true-height|apparent-height, sex) ≠ P(height|apparent-height) because of [regression to the mean](https://en.wikipedia.org/wiki/Regression_toward_the_mean) (and women and men regress to different means). But [this all happens subconsciously](/2020/Apr/peering-through-reverent-fingers/): in the same study, when the authors tried height-matching the photographs (for every photo of a woman of a given height, there was another photo in the set of a man of the same height) _and telling_ the participants about the height-matching _and_ offering a cash reward to the best height-judge, more than half of the stereotyping effect remained. It would seem that people can't consciously readjust their learned priors in reaction to verbal instructions pertaining to an artificial context.
+Give people photographs of various women and men and ask them to judge how tall the people in the photos are, as [Nelson _et al._ 1990 did](/papers/nelson_et_al-everyday_base_rates_sex_stereotypes_potent_and_resilient.pdf), and people's guesses reflect both the photo-subjects' actual heights and (to a lesser degree) their sex. Unless you expect people to be perfect at assessing height from photographs (when they don't know how far away the cameraperson was standing, aren't ["trigonometrically omniscient"](https://plato.stanford.edu/entries/logic-epistemic/#LogiOmni), _&c._), this behavior is just _correct_: men really are taller than women on average, so P(true-height|apparent-height, sex) ≠ P(true-height|apparent-height) [because of](https://humanvarieties.org/2017/07/01/measurement-error-regression-to-the-mean-and-group-differences/) [regression to the mean](https://en.wikipedia.org/wiki/Regression_toward_the_mean) (and women and men regress to different means). But [this all happens subconsciously](/2020/Apr/peering-through-reverent-fingers/): in the same study, when the authors tried height-matching the photographs (for every photo of a woman of a given height, there was another photo in the set of a man of the same height) _and telling_ the participants about the height-matching _and_ offering a cash reward to the best height-judge, more than half of the stereotyping effect remained. It would seem that people can't consciously readjust their learned priors in reaction to verbal instructions pertaining to an artificial context.
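Here's a minimal sketch of why using sex as a cue is the correct inference when your perception is noisy. It assumes heights are normally distributed within each sex and that the apparent height in a photo is the true height plus independent normal noise. All the numbers (group means, standard deviations, noise level) are invented for illustration and aren't taken from Nelson _et al._, and the function name is hypothetical.

```python
def posterior_mean_height(apparent, group_mean, group_sd, noise_sd):
    """Best guess of true height given a noisy observation and a normal prior.

    The estimate is pulled from the observation toward the group mean, and
    the pull is stronger the noisier the observation is.
    """
    weight_on_observation = group_sd**2 / (group_sd**2 + noise_sd**2)
    return group_mean + weight_on_observation * (apparent - group_mean)


# Made-up parameters in centimeters.
MEN_MEAN, WOMEN_MEAN, WITHIN_SEX_SD, PHOTO_NOISE_SD = 176.0, 163.0, 7.0, 5.0

apparent = 170.0  # the same apparent height in both photos
guess_man = posterior_mean_height(apparent, MEN_MEAN, WITHIN_SEX_SD, PHOTO_NOISE_SD)
guess_woman = posterior_mean_height(apparent, WOMEN_MEAN, WITHIN_SEX_SD, PHOTO_NOISE_SD)
print(f"man: {guess_man:.1f} cm, woman: {guess_woman:.1f} cm")
```

Two photos with the same apparent height get different best guesses, because each regresses toward a different mean. With `noise_sd` set to zero the prior drops out entirely and the guess is just the observation, which is the renormalized-egalitarian moral: the better your individual information, the less the demographic prior matters.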
Once you understand at a _technical_ level that probabilistic reasoning about demographic features is both epistemically justified, _and_ implicitly implemented as part of the way your brain processes information _anyway_, then a moral theory that forbids this starts to look less compelling? Of course, statistical discrimination on demographic features is only epistemically justified to exactly the extent that it helps _get the right answer_. Renormalized-egalitarians can still be properly outraged about the monstrous tragedies where I have moral property P but I _can't prove it to you_, so you instead guess _incorrectly_ that I don't just because other people who look like me mostly don't, and you don't have any better information to go on—or tragedies in which a feedback loop between predictions and social norms creates or amplifies group differences that wouldn't exist under some other social equilibrium.