Comment on a Scene from Planecrash: "Crisis of Faith"

Sun 12 June 2022

tagged Eliezer Yudkowsky, literary criticism, worldbuilding

Realistic worldbuilding is a difficult art: unable to model what someone else would do except by the "empathic inference" of imagining oneself in that position, authors tend to embarrass themselves writing alleged aliens or AIs that just happen to act like humans, or allegedly foreign cultures that just happen to share all of the idiosyncratic taboos of the author's own culture. The manifestations of this can be very subtle, even to authors who know about the trap.

In Planecrash, a collaborative roleplaying fiction principally by Iarwain (a pen name of Eliezer Yudkowsky) and Lintamande, our protagonist, Keltham, hails from dath ilan, a smarter, more rational, and better-coordinated alternate version of Earth. Keltham has somehow survived his apparent death and woken up in the fantasy world of Golarion, and sets about uplifting the natives using knowledge from his more advanced civilization.

In the "Crisis of Faith" thread, Keltham has just arrived in the country of Osirion. While much better than his last host nation (don't ask), Keltham is dismayed at its patriarchal culture in which women typically are not educated and cannot own property, and is considering his options for reforming the culture in conjunction with sharing his civilization's knowledge. Having been advised to survey what native women think of their plight before seeking to upend their social order, Keltham asks a middle-aged woman:

Suppose some dreadful meddling foreigner came in and told Osirion that its laws had to be the same for men and women, and halflings and tieflings and elves too, but men and women are the main focus here. You can make a law that the person with higher Wisdom gets to be in charge of the household; you can make a law about asking people under truthspell if they've ever gotten drunk and hurt somebody; you can't make any law that talks about whether or not somebody has a penis. You can talk about whether somebody has a child, but not whether that person was mother or father, the child girl or boy.

In the conversation that follows, the woman suggests military conscription as a legitimate reason for why the law might need to discriminate on sex. Keltham suggests, "Test people on combat ability, truthspell them to see if they were sandbagging it."

... and that's the part that broke my suspension of disbelief in Keltham being a realistic portrayal of someone who grew up in dath ilan as it has been described to us, rather than being written by people who live in Berkeley in the current year who don't know how to think outside of their own culture's assumptions.

To be clear, it makes sense that Keltham feels bad for the women of Osirion, who seem so much less self-actualized than the women of his world. It makes sense that he wants to smash the patriarchy, and reform their sexist customs about education and property.

But the specific way in which he's formulating the problem—that the law should be "the same for men and women, and halflings and tieflings and elves too"—seems distinctively American. The idea that the government can't discriminate by race or sex as a principle (as contrasted to most laws happening to not refer to race or sex because those categories happen to not be relevant to that specific law) is a specific form of Earth-craziness that only makes sense as a reaction to other Earth-craziness; it's not something you would ever spontaneously invent or think was a good idea if you actually came from a 140 IQ Society that thoroughly educated everyone in probability theory as normative reasoning. Let me explain.

Keltham is, of course, correct that if you have specific information about an individual's traits, that screens off any probabilistic guesses you might have made about those traits knowing only the person's demographic category. Once you measure someone's height, the fact that men are taller than women on average with an effect size of about 1.5 standard deviations is no longer relevant to the question of that person's height. (As the saying goes out of dath ilan, hug the query!) In very many situations, if there's a cost associated with acquiring more specific individuating information that renders information from demographic base rates irrelevant, you should pay that cost in order to get the more specific information and therefore make better decisions.

But crucially, getting individuating information is an instrumental rather than a terminal value; you should do it when and because it improves your decisions, not because of some alleged principle that you're not allowed to make probabilistic inferences off someone's race or sex. Probability theory doesn't have any built-in concept of "protected classes." On pain of paradox, Bayesians must condition on all available information. If groups differ in decision-relevant traits, of course you should treat members of those groups differently! What we call "discrimination" in America on Earth is actually just Bayesian reasoning; P(H|E) = P(E|H)P(H)/P(E) doesn't stop being true when H happens to be "I should hire this candidate" and E happens to be "The candidate is a halfling".
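As a toy illustration of the update rule above, here is a minimal Python sketch. The function name and all of the numbers are made up for illustration; none of them come from the story or from any dataset.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) from Bayes' theorem.

    P(E) is expanded by the law of total probability over H and not-H.
    """
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e


# Hypothetical numbers: evidence twice as likely under H as under not-H.
informative = posterior(prior=0.5, p_e_given_h=0.2, p_e_given_not_h=0.1)

# Hypothetical numbers: evidence equally likely either way, so no update.
uninformative = posterior(prior=0.3, p_e_given_h=0.4, p_e_given_not_h=0.4)

print(round(informative, 3), round(uninformative, 3))
```

The second call is the case where the evidence carries no information: when the likelihood ratio is one, the posterior equals the prior, whatever kind of fact E happens to be.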

Furthermore, it's not obvious that the law should behave any differently in this respect than a private individual: is Governance supposed to be less Bayesian because it's Governance?! (Although, perhaps there's a distinction between the "law" and "public policy" functions of Governance, with the former laying out timeless rights and principles, whereas day-to-day decisions about the empirical world are farmed out to the latter?)

Some implications: if there's a cost associated with taking individual measurements, and the cost exceeds the amount you would save by making better decisions, then you shouldn't take the measurements. If your measurements have error, then your estimate of the true value of the trait being measured regresses to the group mean to some quantitative extent. Again, all this just falls out of ordinary Bayesian decision theory, which continues to work even when some of the hypotheses are about groups of people.
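The regression-to-the-mean claim can be sketched under the simplest textbook assumptions: the trait is normally distributed within the group, and the measurement error is independent, zero-mean, and normal. The function name and the scores below are hypothetical, chosen only to show the shape of the effect.

```python
def shrunk_estimate(measurement, group_mean, group_sd, error_sd):
    """Posterior mean of a trait given one noisy measurement.

    Assumes a normal prior (the group distribution) and normal measurement
    error. The weight on the measurement is the share of total variance
    that comes from real trait variation rather than from noise.
    """
    group_var = group_sd ** 2
    error_var = error_sd ** 2
    weight = group_var / (group_var + error_var)
    return weight * measurement + (1 - weight) * group_mean


# Hypothetical Strength scores: group mean 10, group SD 3, measured value 16.
perfect_test = shrunk_estimate(16, group_mean=10, group_sd=3, error_sd=0)
noisy_test = shrunk_estimate(16, group_mean=10, group_sd=3, error_sd=3)

print(perfect_test, noisy_test)
```

With an error-free test the group mean drops out entirely, which is the screening-off case. With error as large as the group's own spread, the best estimate sits halfway between the measurement and the group mean.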

If this still seems counterintuitive, it may help to consider that from the standpoint of Just Doing Bayesian Decision Theory, the distinction between "information from demographic group membership" and "information from individual measurements" isn't fundamental. The reason it seems unjust to notice race when you can just look at an individual's Strength, Intelligence, Wisdom, and Charisma scores, is because the relationship between race and any actual decision you might care about is merely statistical: it's not fair to always look to the orc if you need someone in your party to lift a fallen tree, just because orcs are stronger than other races on average, because it could easily be the case that this particular orc is less suited to the task than other party members.

But the relationship between "measured traits" and any actual decision you might care about is also merely statistical. The reason we have a concept of "Intelligence" is because it turns out that people's performances on various mental tasks happen to positively correlate with each other, but that's just on average: it could easily be the case that this particular Intelligence 18 person is less suited to a particular task than some Intelligence 14 person. Mathematically, it's the same issue.

We don't typically think of it as the same issue here in America on Earth. People do sometimes complain about inappropriate reliance on faulty "individual trait" proxies: that holding a college degree isn't the same thing as being educated, that job interviews aren't the same thing as job performance, that IQ is not intelligence. But the objection doesn't pack the same moral force in our culture, as can be seen by how often complaints about "individual" proxies are justified in terms of their effects on demographic groups, as when it is argued that "whiteboard" coding tests are bad for diversity, or that IQ is racist.

The explanation for the difference in intuitions is as much political as it is moral. On account of being visible clusters in a "thick" subspace of configuration space (having many different correlates, even if the effect size along any one dimension may not be very large), race and sex are salient as markers for coordination. Groupings made on the basis of less visible and lower-dimensional traits, like "People with Intelligence 14", don't form a natural "interest group" in the same way, even if the lower-dimensional trait is more decision-relevant in many contexts. Conflict between interest groups in a democratic Society like America creates memetic selection pressure for "equality" memes that deny the existence of non-superficial group differences, as the natural Schelling point for preventing group conflicts. It's an idea born of distrust in reasoning in an adversarial environment: if you let people make probabilistic inferences using race or sex as inputs, they might motivatedly try to add bad inferences to Society's shared maps that would give their own demographic an advantage in conflicts. It's safer to nip such Shenanigans in the bud by disallowing the whole class of thought to begin with: can't oppress people on the basis of race if race doesn't exist!

But Keltham isn't from America; you'd expect his thoughts to be optimized for solving problems, not disallowing Shenanigans. Everything we've been told about dath ilan emphasizes that they should be collectively smart enough not to fall into this crazy trap of political incentives making a certain class of correct Bayesian updates socially taboo in order to avert other social ills; the Keepers should have pre-emptively done the analysis in the preceding paragraph without having to empirically see it eat their Society's sanity, and incorporated the appropriate counter-memes in their rationality training for children. To the dath ilani intuition, then, the quantitative extent to which the statement "It's wrong to make X decision about someone just because they're Y" makes sense, depends quantitatively on how strongly Y predicts the outcomes of X. Whether Y is an "individual trait" like having Intelligence 18 or a demographic category like being female does not matter.

This is also how American people's intuitions work in contexts where their paranoid egalitarian meliorist memetic antibodies haven't been activated. Consider how the text of Planecrash itself repeatedly contrasts Keltham to everyone else in the world of Golarion. No one (neither Watsonianly in the text, nor Doylistically in various discussions of the text on Discord) is shy about saying that Keltham is special in this setting because he's dath ilani. We don't insist on talking about how Keltham is smart and knows about probability theory and knows about chemistry and doesn't know about Golarionian theology and is accustomed to a high material standard of living and is squeamish about seeing slave markets, as if these were separate, isolated facts about Keltham as an idiosyncratic individual. We connect these facts to Keltham's nationality even though, if you look, there are surely also natives of Golarion who are smart (to some quantitative extent) and know about chemistry (to some quantitative extent) and disapprove of slavery (to some quantitative extent), because our whole high-dimensional picture of what Keltham is (comprising many, many traits to their respective quantitative extents) is, in fact, causally downstream of the "essential" fact of his having grown up in another world. It's either not bigoted to notice, or a cognitive system requires some amount of "bigotry" in order to function.

However, just because noticing group differences is theoretically sound doesn't mean it's always the right thing to focus on. Might it not be the case in practice that statistical group differences are small enough, and that individual trait measurements are cheap and reliable enough, such that "don't discriminate by race or sex" is a useful heuristic?

It's an empirical issue—but sure, very often, yes. For most jobs—especially most jobs in industrialized Societies like dath ilan or America—"always test the individual's aptitude, never use sex as a proxy" is a fine rule, because most jobs primarily rely on human general intelligence: there was no dentistry in the environment of evolutionary adaptedness, and thus there's no reason why women or men should make better dentists. In domains where sex differences are small, using sex as a proxy would just be dumb, not unjust.

But then it's bizarre that Keltham persists in his no-legal-sex-discrimination stance when his interlocutor brings up military conscription as a potential counterexample. Because, well, as unpleasant as it is for modern folk to think about ... there was war in the environment of evolutionary adaptedness. Men's bodies are built for war. Men's emotions are built for war. (Males have more reproductive fitness to gain and less to lose by the prospect of risking death in a war where the victors gain mating opportunities.) The sex difference in muscle mass is 2.6 standard deviations. That means a woman as strong as the average man is at the 99.5th percentile for women. That means if you just select everyone whose strength is greater than one standard deviation below the male mean, you end up excluding 94.5% of women.
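The two percentages above can be checked in a few lines of Python. This is a minimal sketch under an assumption the text leaves implicit: both sexes' strength distributions are normal with the same standard deviation, so everything can be expressed in units of that standard deviation. The variable names are mine.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Effect size quoted in the text, in standard deviations.
d = 2.6

# The average man sits d standard deviations above the female mean.
female_percentile_of_average_man = std_normal.cdf(d)

# A cutoff one SD below the male mean is (d - 1) SDs above the female mean.
share_of_women_excluded = std_normal.cdf(d - 1.0)

print(round(female_percentile_of_average_man, 3))  # 0.995
print(round(share_of_women_excluded, 3))           # 0.945
```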

Notwithstanding that Keltham grew up in a peaceful industrialized Society that screened off its history (such that he wouldn't have read histories of some analogue of Genghis Khan), it seems like Keltham should know this stuff? We're told that dath ilan has very advanced evolutionary psychology, and there's no apparent reason for them to have spent any of their eugenics bandwidth selecting for reduced sexual dimorphism. (Although given the Purely Aesthetic Gender in Pathfinder, it seems reasonable to posit reduced sexual dimorphism in Golarion?) If dath ilan doesn't have enough (non-counterfactual) violence to make strength differences salient, do they have sports? (In the peaceful industrialized Society where I grew up, it was salient that my mediocre cross-country times were often better than the best girls' times.) We're told that ordinary dath ilani are good at reasoning about effect sizes.

But if Keltham does know this stuff, why is he talking like a UC Berkeley graduate? "Strength is an externally visible and measurable quality that determines who you want in your army; you don't need to go by the presence of penises," he says. When his interlocutor objects that strong women would get drafted, which would be terrible, Keltham asks how it would be more terrible than men getting drafted. When the interlocutor replies that the woman's marriage prospects would be damaged by a history of living in close quarters with men in the army, Keltham muses that it sounds like she's implying that "the army would need strong enough internal governance to prevent women in it from being raped, but you could do that with cheaper truthspells?"

There's just so much wrong with this exchange from the perspective of anyone who knows anything about humans and isn't playing dumb for a religious American audience.

Firstly, if you decided that strength is the quality that determines who you want in your army, you should notice that you're going to be drafting almost all men anyway. (Again, a sex difference of 2.6 standard deviations and a selection threshold 1 standard deviation below the male mean gives you a male:female ratio of (1 − Φ(−1))/(1 − Φ(1.6)) ≈ 15.4:1, where Φ is the cumulative distribution function of the normal distribution.)
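For anyone who wants to vary the inputs, the ratio in the parenthetical can be reproduced with a short Python sketch. It assumes equal-variance normal strength distributions and equal numbers of men and women in the candidate pool; the function name and parameter names are mine.

```python
from statistics import NormalDist


def selection_ratio(effect_size, cutoff_below_male_mean):
    """Male:female ratio among those who pass a strength cutoff.

    All quantities are in units of the common standard deviation. The
    cutoff is expressed as a distance below the male mean, and the female
    mean sits effect_size standard deviations below the male mean.
    """
    phi = NormalDist().cdf
    male_pass_rate = 1 - phi(-cutoff_below_male_mean)
    female_pass_rate = 1 - phi(effect_size - cutoff_below_male_mean)
    return male_pass_rate / female_pass_rate


ratio = selection_ratio(effect_size=2.6, cutoff_below_male_mean=1.0)
print(round(ratio, 1))  # 15.4
```

Raising the cutoff makes the ratio more extreme, because the female pass rate in the tail shrinks faster than the male pass rate does.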

To this, the Berkeley graduate might reply, "So then the optimal army has 15 men for every woman; what's the problem with that? Surely you don't want to make your army less strong just to satisfy some weird æsthetic that all your soldiers should have the same kind of genitals?"

A minor counterreply would be that, if people's sex is public information but there are administrative costs associated with strength-testing everyone, you probably wouldn't bother testing the women, for the same reason that, if you were mining for spellsilver ore, and one mine had fifteen times as much ore as the other, you wouldn't even set up your tools at the poorer mine until you had completely exhausted the first.

But more fundamentally, even if you assume strength-testing is free, we haven't yet taken into account all other sex differences that are relevant to military performance. It's not just that any other individual traits (e.g., aggression) that you select for will stack multiplicatively, resulting in even more extreme ratios. There are also group-level effects that aren't captured by measuring the traits of individual soldiers: the social dynamics of a squad of fifteen men and one woman are going to be different from those of a squad of sixteen men. Even if you've selected the woman for strength and every martial virtue to equal any man, do the men know that in their subconscious, or are they going to be biased to want to protect her or seek her favor in a way that they wouldn't in an all-male environment?

You could command them not to—but does that actually work? People don't have conscious access to or control of the way their brain takes demographic base rates into account. Nelson et al. 1990 gave people photographs of women and men and asked them to estimate the photo-subjects' heights. The estimates end up reflecting sex as well as actual-height—which is, again, the correct Bayesian behavior given uncertainty in sex-blind estimates. But furthermore, when the researchers prepared a special height-matched set of photos (where for every woman of a given height, there was a man of the same height in the photo set) and told the participants about the height-matching and offered cash rewards for accuracy, more than half of the base-rate adjustment still remained! People don't know how to turn it off!

And if they could turn it off, such that you could order your male soldiers not to treat a woman among them any differently than they would a man, and have the verbal instruction have exactly the desired effect on their brain's subconscious quantitative decisionmaking machinery—who is this even helping, exactly?

Keltham expresses doubt whether it's worse for a woman to be conscripted than a man, and when his interlocutor gestures at harms to a woman from living among men (not trusted family members, but men unselected from the general public), Keltham understands that she's talking about the possibility of intercourse, including rape (!), and he immediately generates "cheap truthspells" as a way to mitigate that problem while maintaining sex-integrated military units.

And, sure, I agree that truthspells would help, given the assumption that you need to have sex-integrated military units. But—why is that a desideratum, at all? We're told that dath ilan's beliefs about evolutionary psychology include the idea that:

The untrained male has an instinct to seize and guard a woman's reproductive capacity, instinctively using violence to stop her from interacting with other men at the same time that he instinctively displays other forms of commitment to try to earn her acquiescence. The untrained female has adaptations that assume an environment in which men will try to pressure her into more sex than is optimal for her own reproductive fitness, so her adaptations push her to instinctively resist that pressure while also instinctively trying to increase the number and quality of men who'll be interested in her.

And just—if you actually believe that, it seems like there's this very obvious policy of not forcing females to fight in close quarters alongside the people with an instinct to seize and guard female reproductive capacity?! (Come to think of it, the "instinctively trying to increase the number and quality of men who'll be interested in her" part seems like it could cause other kinds of problems, too??) Even if you have cheap truthspells, there's this concept of 'securitymindset', where you want to design systems that are robust against unexpected things happening, and the "Just don't conscript women in the first place" policy neatly sidesteps entire classes of potential social pathologies that you don't want to have to deal with at all in the organization you're using to keep your country from getting conquered?! If someone asks whether it's worse for a woman or a man to be put in the situation of having to fight in close quarters alongside people with an instinct to seize and guard female reproductive capacity, I don't think it should be hard to admit the obvious correct answer that that's worse for a woman?!

I mean, it's not worse with Probability One. Like any dath ilani or religiously devout American, I cherish diversity and exceptions, and want to treat people who are unusual for their demographic with the same care and respect as anyone else! (More, actually.) It's just—it seems like it should be possible to do that without trashing our ability to have conventions that perform well in the average case?? To the extent that there is a minority of women who want nothing more than to die gloriously in battle in service to their country, then, sure, you'd want and expect the country to be able to make use of that—and whether you want to induct them into the regular army, or have a special women's corps is a complicated policy question that you'd want to make after appropriately weighing all of the trade-offs (like the unit-cohesion objection vs. less skill transfer due to not having cross-sex mentorships).

It's just—wasn't dath ilan's whole thing supposed to be about coordinating to find the optimal multi-agent policy using evidence and quantitative reasoning?! And suddenly Keltham is casually proposing "stopp[ing] being able to measure people's sex and treat them differently based on that" without noticing that this is excluding huge swathes of policyspace (such as "conscript males, but accept female volunteers") for ideological reasons!? I feel like I'm taking crazy pills!!

Maybe there's just no way to explain this in a way that makes sense to American ears? I still feel guilty writing this stuff. It's just—I was trained, long ago back in the 'aughts, in a certain Art, and I'm pretty sure we were taught that being able to measure things and make different decisions based on the measurements was a good thing in full generality, without there being any special exception that specific cluster-membership measurements are actually bad?!

(Thanks to Ilzo for feedback.)