+[^clarification-quibbles]: The way that the post takes pains to cast doubt on whether someone who is alleged to have committed the categories-are-arbitrary fallacy is likely to have actually committed it ("the mistake seems like it wouldn't actually fool anybody or be committed in real life, I am unlikely to be sympathetic to the argument", "But be wary of accusing somebody of planning to do this, if you haven't documented them actually doing it") is in stark contrast to the way that "A Human's Guide to Words" had taken pains to emphasize that categories shape cognition regardless of whether someone is consciously trying to trick you (["drawing a boundary in thingspace is not a neutral act [...] Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind"](https://www.lesswrong.com/posts/veN86cBhoe7mBxXLk/categorizing-has-consequences)). I'm suspicious that the change in emphasis reflects the need to not be seen as criticizing the "pro-trans" coalition, rather than any new insight into the subject matter.
+
+ The first comment on the post linked to "... Not Man for the Categories". Yudkowsky replied, "I assumed everybody reading this had already read [https://wiki.lesswrong.com/wiki/A_Human's_Guide_to_Words](https://wiki.lesswrong.com/wiki/A_Human's_Guide_to_Words)", a _non sequitur_ that could be taken to suggest (but did not explicitly say) that the moral of "... Not Man for the Categories" was implied by "A Human's Guide to Words" (in contrast to my contention that "... Not Man for the Categories" was getting it wrong).
+
+I wrote to Michael, Ben, Jessica, Sarah, and "Riley", thanking them for their support. After successfully bullying Scott and Eliezer into clarifying, I was no longer at war with the robot cult and was feeling a lot better (Subject: "thank-you note (the end of the Category War)").
+
+I had a feeling, I added, that Ben might be disappointed with the thank-you note insofar as it could be read as me having been "bought off" rather than being fully on the side of clarity-creation. But I contended that not being at war actually made it emotionally easier to do clarity-creation writing. Now I would be able to do it in a contemplative spirit of "Here's what I think the thing is actually doing" rather than in hatred with [flames on the side of my face](https://www.youtube.com/watch?v=nrqxmQr-uto&t=112s).
+
+-----
+
+There's a dramatic episode that would fit here chronologically if this were an autobiography (which existed to tell my life story), but since this is a topic-focused memoir (which exists because my life happens to contain this Whole Dumb Story which bears on matters of broader interest, even if my life would not otherwise be interesting), I don't want to spend more wordcount than is needed to briefly describe the essentials.
+
+I was charged by members of the extended Michael Vassar–adjacent social circle with the duty of taking care of a mentally-ill person at my house on 18 December 2020. (We did not trust the ordinary psychiatric system to act in patients' interests.) I apparently did a poor job, and ended up saying something callous on the care team group chat after a stressful night, which led to a chaotic day on the nineteenth, and an ugly falling-out between me and the group. The details aren't particularly of public interest.
+
+My poor performance during this incident [weighs on my conscience](/2020/Dec/liability/) particularly because I had [previously](/2017/Mar/fresh-princess/) [been](/2017/Jun/memoirs-of-my-recent-madness-part-i-the-unanswerable-words/) in the position of being crazy and benefiting from the help of my friends (including many of the same people involved in this incident) rather than getting sent back to psychiatric prison ("hospital", they call it a "hospital"). Of all people, I had a special debt to "pay it forward", and one might have hoped that I would also have special skills, that having been on the receiving end of a non-institutional psychiatric tripsitting operation would help me know what to do on the giving end. Neither of those panned out.
+
+Some might appeal to the proverb "All's well that ends well", noting that the person in trouble ended up recovering, and that, while the stress of the incident contributed to a somewhat serious relapse of my own psychological problems on the night of the nineteenth and in the following weeks, I ended up recovering, too. But recovering normal functionality after a traumatic episode doesn't imply a lack of other lasting consequences (to the psyche, to trusting relationships, _&c._). I am therefore inclined to dwell on [another proverb](https://www.alessonislearned.com/), "A lesson is learned but the damage is irreversible."
+
+-----
+
+I published ["Unnatural Categories Are Optimized for Deception"](https://www.lesswrong.com/posts/onwgTH6n8wxRSo2BJ/unnatural-categories-are-optimized-for-deception) in January 2021.
+
+I wrote back to Abram Demski regarding his comments from fourteen months before: on further thought, he was right. Even granting my point that evolution didn't figure out how to track probability and utility separately, as Abram had pointed out, the fact that it didn't meant that not tracking them separately could be an effective AI design. Just because evolution takes shortcuts that human engineers wouldn't didn't mean those shortcuts were "wrong". (Rather, there are laws governing which kinds of shortcuts work.)
+
+Abram was also right that it would be weird if reflective coherence was somehow impossible: the AI shouldn't have to fundamentally reason differently about "rewriting code in some 'external' program" and "rewriting 'its own' code." In that light, it made sense to regard "have accurate beliefs" as merely a convergent instrumental subgoal, rather than what rationality is about—as sacrilegious as that felt to type.
+
+And yet, somehow, "have accurate beliefs" seemed more fundamental than other convergent instrumental subgoals like "seek power and resources". Could this be made precise? As a stab in the dark, was it possible that the [theorems on the ubiquity of power-seeking](https://www.lesswrong.com/posts/6DuJxY8X45Sco4bS2/seeking-power-is-often-robustly-instrumental-in-mdps) might generalize to a similar conclusion about "accuracy-seeking"? If they didn't, the reason why they didn't might explain why accuracy seems more fundamental.
+
+-----