+> If I fell asleep and woke up as a true woman—not in body, but in brain—I don't think I'd call her "me". The change is too sharp, if it happens all at once.
+
+In the comments, [I wrote](https://www.greaterwrong.com/posts/QZs4vkC7cbyjL9XA9/changing-emotions/comment/4pttT7gQYLpfqCsNd)—
+
+> Is it cheating if you deliberately define your personal identity such that the answer is _No_?
+
+To which I now realize the correct answer is—_yes!_ Yes, it's cheating! Category-membership claims of the form "X is a Y" [represent hidden probabilistic inferences](https://www.lesswrong.com/posts/3nxs2WYDGzJbzcLMp/words-as-hidden-inferences); inferring that entity X is a member of category Y means [using observations about X to decide to use knowledge about members of Y to make predictions about features of X that you haven't observed yet](https://www.lesswrong.com/posts/gDWvLicHhcMfGmwaK/conditional-independence-and-naive-bayes). But this AI trick can only _work_ if the entities you've assigned to category Y are _actually_ similar—if they form a tight cluster in configuration space, such that using the center of the cluster to make predictions about unobserved features gets you _close_ to the right answer, on average.
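+
+To make the prediction-from-cluster-membership idea concrete, here is a minimal sketch in Python. Everything in it (the feature space, the numbers, the function names) is invented for illustration; the only point is that the trick works exactly to the extent that the category's members actually cluster:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+
+# A "tight" category: members cluster closely in configuration space,
+# so the cluster's center predicts a member's unobserved features well.
+tight_cluster = rng.normal(loc=[5.0, 2.0, 7.0], scale=0.1, size=(1000, 3))
+
+# A "loose" category: members are scattered, so the center is a poor guess.
+loose_cluster = rng.normal(loc=[5.0, 2.0, 7.0], scale=5.0, size=(1000, 3))
+
+def prediction_error(cluster: np.ndarray) -> float:
+    """Average distance from the cluster center to a member: how wrong
+    you are, on average, when you predict a member's unobserved features
+    from the category's central tendency."""
+    center = cluster.mean(axis=0)
+    return float(np.linalg.norm(cluster - center, axis=1).mean())
+
+print(prediction_error(tight_cluster))  # small: the category earns its keep
+print(prediction_error(loose_cluster))  # large: "X is a Y" tells you little
+```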
+
+The rules don't change when the entity X happens to be "my female analogue" and the category Y happens to be "me". The ordinary concept of "personal identity" tracks how the high-level features of individual human organisms are stable over time. You're going to want to model me-on-Monday and me-on-Thursday as "the same" person even if my Thursday-self woke up on the wrong side of the bed and has three whole days of new memories. When interacting with my Thursday-self, you're going to be using your existing mental model of me, plus a diff for "He's grumpy" and "Haven't seen him in three days"—but that's a _very small_ diff, compared to the diff between me and some other specific person you know, or the diff between me and a generic human who you don't know.
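+
+The "small diff" claim can be cashed out, loosely, as a comparison of distances between state vectors. A toy illustration (the features and numbers are invented; no real psychometric model is intended):
+
+```python
+import numpy as np
+
+# Invented "personality/state" vectors: mood, plus a few stable traits.
+me_monday    = np.array([0.8, 0.3, 0.5, 0.9])
+me_thursday  = np.array([0.6, 0.3, 0.5, 0.9])  # grumpier, otherwise stable
+someone_else = np.array([0.2, 0.9, 0.1, 0.4])
+
+def diff_size(a: np.ndarray, b: np.ndarray) -> float:
+    """How big a patch you need to turn your model of a into a model of b."""
+    return float(np.linalg.norm(a - b))
+
+print(diff_size(me_monday, me_thursday))   # small: reuse the model, apply a diff
+print(diff_size(me_monday, someone_else))  # large: you need a different model
+```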
+
+In everyday life, we're almost never in doubt as to which entities we want to consider "the same" person (like me-on-Monday and me-on-Thursday), but we can concoct science-fictional thought experiments that force [the Sorites problem](https://plato.stanford.edu/entries/sorites-paradox/) to come up. What if you could _interpolate_ between two people—construct a human with a personality "in between" yours and mine, who had both of our memories, or some fraction of each? (You know, like [Tuvix](https://memory-alpha.fandom.com/wiki/Tuvix_(episode)).) At what point on the spectrum would that person be me, or you, or both, or neither? (Derek Parfit has [a book](https://en.wikipedia.org/wiki/Reasons_and_Persons#Personal_identity) with lots of these.)
+
+People _do_ change a lot over time; there _is_ a sense in which, in some contexts, we _don't_ want to say that a sixty-year-old is the "same person" they were when they were twenty—and forty years is "only" 4,870 three-day increments. But if a twenty-year-old were to be magically replaced with their sixty-year-old future self (not just superficially wearing an older body like a suit of clothing, but their brain actually encoding forty more years of experience and decay) ... well, there's a reason I reached for the word "replace" (suggesting putting a _different_ thing in something's place) when describing the scenario. That's what Yudkowsky means by "the change is too sharp"—the _ordinary_ sense in which we model people as the "same person" from day to day (despite people having [more than one proton](/2019/Dec/on-the-argumentative-form-super-proton-things-tend-to-come-in-varieties/) in a different place from day to day) has an implicit [Lipschitz condition](https://en.wikipedia.org/wiki/Lipschitz_continuity) buried in it, an assumption that people don't change _too fast_.
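+
+Spelled out in notation the post doesn't actually use (take $S(t)$ to be a person's psychological state at time $t$, and $d$ some hypothetical distance metric on states), the implicit assumption is
+
+$$d\big(S(t_1), S(t_2)\big) \le K \, |t_1 - t_2|$$
+
+for some modest constant $K$: the psychological distance traversed is bounded by a multiple of the time elapsed. Day-to-day changes respect the bound; waking up as one's sixty-year-old self is a large $d$ over a single night, which violates it.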
+
+The thing about Sorites problems is that they're _incredibly boring_. The map is not the territory. The distribution of sand-configurations we face in everyday life is such that we usually have an answer as to whether the sand "is a heap" or "is not a heap", but in the edge-cases where we're not sure, arguing about whether to use the word "heap" _doesn't change the configuration of sand_. You might think that if [the category is blurry](https://www.lesswrong.com/posts/dLJv2CoRCgeC2mPgj/the-fallacy-of-gray), you therefore have some freedom to [draw its boundaries](https://www.lesswrong.com/posts/d5NyJ2Lf6N22AD9PB/where-to-draw-the-boundary) the way you prefer—but [the cognitive function of the category is for making probabilistic inferences on the basis of category-membership](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries), and those probabilistic inferences can be quantitatively better or worse. [Preferences over concept definitions that aren't about maximizing predictive accuracy are therefore preferences _for deception_](https://www.lesswrong.com/posts/onwgTH6n8wxRSo2BJ/unnatural-categories-are-optimized-for-deception), because ["making probability distributions less accurate in order to achieve some other goal" is what _deception_ means](https://www.lesswrong.com/posts/YptSN8riyXJjJ8Qp8/maybe-lying-can-t-exist).
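+
+Here's a toy quantification of "quantitatively better or worse" (Python, with invented numbers): draw two real clusters, then compare the prediction error you incur with a boundary at the actual gap between them against a boundary redrawn to get a preferred answer:
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+
+# Two real clusters; feature 0 is observed, feature 1 is the one we
+# want to predict from category membership.
+blue = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(1000, 2))
+red  = rng.normal(loc=[5.0, 5.0], scale=1.0, size=(1000, 2))
+points = np.vstack([blue, red])
+
+def mse(boundary: float) -> float:
+    """Assign categories by thresholding feature 0, then predict each
+    point's feature 1 as the mean of its assigned category."""
+    in_blue = points[:, 0] < boundary
+    blue_pred = points[in_blue, 1].mean()
+    red_pred = points[~in_blue, 1].mean()
+    pred = np.where(in_blue, blue_pred, red_pred)
+    return float(((points[:, 1] - pred) ** 2).mean())
+
+print(mse(2.5))  # boundary at the natural gap: low prediction error
+print(mse(4.5))  # boundary gerrymandered toward a preference: higher error
+```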
+
+That's why defining your personal identity to get the answer you want is cheating. If the answer you wanted was actually _true_, you could just say so without needing to _want_ it.
+
+When [Phineas Gage's](/2017/Dec/interlude-xi/) friends [said he was "no longer Gage"](https://en.wikipedia.org/wiki/Phineas_Gage) after the railroad accident, what they were trying to say was that interacting with post-accident Gage was _more relevantly similar_ to interacting with a stranger than it was to interacting with pre-accident Gage, even if Gage-the-physical-organism was contiguous along the whole stretch of spacetime.
+
+Same principle when Yudkowsky wrote, "If I fell asleep and woke up as a true woman [...] I don't think I'd call her 'me'". The claim is that psychological sex differences are large enough to violate the Lipschitz condition imposed by our _ordinary_ concept of personal identity. Maybe he was wrong, but if so, that cashes out as being wrong _about_ how similar women and men actually are (which in principle could be operationalized and precisely computed, even if _we_ don't know how to make it precise), _not_ about whether we prefer the "call her me" or "don't call her me" conclusion and want to _retroactively redefine the meaning of the words in order to make the claim come out "true."_
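+
+"Operationalized and precisely computed" could look like, for instance, a standardized effect size on some trait and the distribution overlap it implies. The following is only a sketch under stated assumptions (equal-variance normal traits, placeholder numbers; real psychometrics is messier and multivariate):
+
+```python
+import numpy as np
+from scipy.stats import norm
+
+def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
+    """Standardized difference between two group means."""
+    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
+    return float((a.mean() - b.mean()) / pooled_sd)
+
+def overlap_coefficient(d: float) -> float:
+    """Overlap of two equal-variance normal distributions whose
+    means differ by d standard deviations."""
+    return float(2 * norm.cdf(-abs(d) / 2))
+
+# Placeholder samples standing in for some psychological trait.
+rng = np.random.default_rng(0)
+group_a = rng.normal(0.0, 1.0, 10_000)
+group_b = rng.normal(0.7, 1.0, 10_000)  # hypothetical effect size d ≈ 0.7
+
+d = cohens_d(group_b, group_a)
+print(d, overlap_coefficient(d))  # the bigger |d|, the sharper the "change"
+```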