memoir: TODO outlining in 2019 and to close

diff --git a/content/drafts/a-hill-of-validity-in-defense-of-meaning.md b/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
index 66bc3f2..2d3c84f 100644
--- a/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
+++ b/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
@@ -182,15 +182,15 @@ But if Yudkowsky didn't want to get into a distracting political fight about a t
  
  But trusting Eliezer Yudkowsky—whose writings, more than any other single influence, had made me who I am—_did_ seem reasonable. If I put him on a pedastal, it was because he had earned the pedastal, for supplying me with my criteria for how to think—including, as a trivial special case, [how to think about what things to put on pedastals](https://www.lesswrong.com/posts/YC3ArwKM8xhNjYqQK/on-things-that-are-awesome).
  
-So if the rationalists were going to get our own philosophy of language wrong over this _and Eliezer Yudkowsky was in on it_ (!!!), that was intolerable, inexplicable, incomprehensible—like there _wasn't a real world anymore_.
+So if the rationalists were going to get our own philosophy of language wrong over this _and Eliezer Yudkowsky was in on it_ (!!!), that was intolerable, inexplicable, incomprehensible—like there _wasn't a real world anymore_. I remember going downstairs to impulsively confide in a senior engineer, an older bald guy who exuded masculinity, who you could tell by his entire manner and being was not infected by the Berkeley mind-virus, no matter how loyally he voted Democrat—not just about the immediate impetus of this Twitter thread, but this whole _thing_ of the past couple years where my entire social circle just suddenly decided that guys like me could be women by means of saying so. He was noncommittally sympathetic; he told me an anecdote about him accepting a trans person's correction of his pronoun usage, with the thought that different people have their own beliefs, and that's OK.
  
-But if Yudkowsky was _already_ stonewalling his Twitter followers, entering the thread myself didn't seem likely to help. (Also, I hadn't intended to talk about gender on that account yet, although that seemed unimportant in light of the present cause for flipping out.)
+If Yudkowsky was _already_ stonewalling his Twitter followers, entering the thread myself didn't seem likely to help. (Also, I hadn't intended to talk about gender on that account yet, although that seemed unimportant in light of the present cause for flipping out.)
  
  It seemed better to try to clear this up in private. I still had Yudkowsky's email address. I felt bad bidding for his attention over my gender thing _again_—but I had to do _something_. Hands trembling, I sent him an email asking him to read my ["The Categories Were Made for Man to Make Predictions"](/2018/Feb/the-categories-were-made-for-man-to-make-predictions/), suggesting that it may qualify as an answer to his question about ["a page [he] could read to find a non-confused exclamation of how there's scientific truth at stake"](https://twitter.com/ESYudkowsky/status/1067482047126495232)—and that, because I cared very much about correcting what I claimed were confusions in my rationalist subculture, that I would be happy to pay up to $1000 for his time—and that, if he liked the post, he might consider Tweeting a link—and that I was cc'ing my friends Anna Salamon and Michael Vassar as a character reference (Subject: "another offer, $1000 to read a ~6500 word blog post about  (was: Re: Happy Price offer for a 2 hour conversation)"). Then I texted Anna and Michael begging them to chime in and vouch for my credibility.
  
  The monetary offer, admittedly, was awkward: I included another paragraph clarifying that any payment was only to get his attention, and not _quid quo pro_ advertising, and that if he didn't trust his brain circuitry not to be corrupted by money, then he might want to reject the offer on those grounds and only read the post if he expected it to be genuinely interesting.
  
-Again, I realize this must seem weird and cultish to any normal people reading this. (Paying some blogger you follow one grand just to _read_ one of your posts? What? Why? Who _does_ that?) To this, I again refer to [the reasons justifying my 2016 cheerful price offer](/2022/TODO/blanchards-dangerous-idea-and-the-plight-of-the-lucid-crossdreamer/#cheerful-price-reasons)—and that, along with tagging in Anna and Michael, who I thought Yudkowsky respected, it was a way to signal that I _really really really didn't want to be ignored_, which I assumed was the default outcome. Surely a simple person such as me was as a mere _worm_ in the presence of the great Eliezer Yudkowsky. I wouldn't have had the audacity to contact him at _all_, about _anything_, if I didn't have Something to Protect.
+Again, I realize this must seem weird and cultish to any normal people reading this. (Paying some blogger you follow one grand just to _read_ one of your posts? What? Why? Who _does_ that?) To this, I again refer to [the reasons justifying my 2016 cheerful price offer](/2022/TODO/blanchards-dangerous-idea-and-the-plight-of-the-lucid-crossdreamer/#cheerful-price-reasons)—and that, along with tagging in Anna and Michael, who I thought Yudkowsky respected, it was a way to signal that I _really really really didn't want to be ignored_, which I assumed was the default outcome. Surely an ordinary programmer such as me was as a mere _worm_ in the presence of the great Eliezer Yudkowsky. I wouldn't have had the audacity to contact him at _all_, about _anything_, if I didn't have Something to Protect.
  
  Anna didn't reply, but I apparently did interest Michael, who chimed in on the email thread to Yudkowsky. We had a long phone conversation the next day lamenting how the "rationalists" were dead as an intellectual community.
  
@@ -238,9 +238,9 @@ And the reason to write this as a desperate email plea to Scott Alexander when I
  
  Back in 2010, the rationalist community had a shared understanding that the function of language is to describe reality. Now, we didn't. If Scott didn't want to cite my creepy blog about my creepy fetish, that was _totally fine_; I liked getting credit, but the important thing is that this "No, the Emperor isn't naked—oh, well, we're not claiming that he's wearing any garments—it would be pretty weird if we were claiming _that!_—it's just that utilitarianism implies that the _social_ property of clothedness should be defined this way because to do otherwise would be really mean to people who don't have anything to wear" gaslighting maneuver needed to _die_, and he alone could kill it.
  
-... Scott didn't get it. We agreed that self-identity-, natal-sex-, and passing-based gender categories each had their own pros and cons, and that it's uninteresting to focus on whether something "really" belongs to a category, rather than on communicating what you mean. Scott took this to mean that what convention to use is a pragmatic choice that we can make on utilitarian grounds, and that being nice to trans people is worth a little bit of clunkiness.
+... Scott didn't get it. We agreed that self-identity-, natal-sex-, and passing-based gender categories each had their own pros and cons, and that it's uninteresting to focus on whether something "really" belongs to a category, rather than on communicating what you mean. Scott took this to mean that what convention to use is a pragmatic choice that we can make on utilitarian grounds, and that being nice to trans people was worth a little bit of clunkiness, that the mental health benefits to trans people were obviously enough to tip the first-order utilitarian calculus.
  
-But I considered myself to be prosecuting _not_ the object-level question of which gender categories to use, but the meta-level question of what normative principles govern which categories we should use, for which, "whatever, it's a pragmatic choice, just be nice" wasn't an answer, because (I claimed) the principles exclude "just be nice" from being a relevant consideration. I didn't have a simple, [mistake-theoretic](https://slatestarcodex.com/2018/01/24/conflict-vs-mistake/) characterization of the language and social conventions that everyone should use such that anyone who defected from the compromise would be wrong. The best I could do was try to objectively predict the consequences of different possible conventions—and of _conflicts_ over possible conventions.
+I didn't think _anything_ about "mental health benefits to trans people" was obvious, but more importantly, I considered myself to be prosecuting _not_ the object-level question of which gender categories to use, but the meta-level question of what normative principles govern which categories we should use, for which (I claimed) "whatever, it's a pragmatic choice, just be nice" wasn't an answer, because the principles exclude "just be nice" from being a relevant consideration.
  
  ["... Not Man for the Categories"](https://slatestarcodex.com/2014/11/21/the-categories-were-made-for-man-not-man-for-the-categories/) had concluded with a section on Emperor Norton, a 19th century San Francisco resident who declared himself Emperor of the United States. Certainly, it's not difficult or costly for the citizens of San Francisco to _address_ Norton as "Your Majesty" as a courtesy or a nickname. But there's more to being the Emperor of the United States than people calling you "Your Majesty." Unless we abolish Congress and have the military enforce Norton's decrees, he's not _actually_ functioning in the role of emperor—at least not according to the currently generally-understood meaning of the word "emperor."
  
@@ -466,7 +466,7 @@ As it happened, I ran into Scott on the train that Friday, the twenty-second. He
  
  Ultimately, I think this was a pedagogy decision that Yudkowsky had gotten right back in 'aught-eight. If you write your summary slogan in relativist language, people predictably take that as license to believe whatever they want without having to defend it. Whereas if you write your summary slogan in objectivist language—so that people know they don't have social permission to say that "it's subjective so I can't be wrong"—then you have some hope of sparking useful thought about the _exact, precise_ ways that _specific, definite_ things are _in fact_ relative to other specific, definite things.
  
-I told him I would send him one more email with a piece of evidence about how other "rationalists" were thinking about the categories issue, and give my commentary on the parable about orcs, and then the present thread would probably drop there.
+I told Scott I would send him one more email with a piece of evidence about how other "rationalists" were thinking about the categories issue, and give my commentary on the parable about orcs, and then the present thread would probably drop there.
  
  On Discord in January, Kelsey Piper had told me that everyone else experienced their disagreement with me as being about where the joints are and which joints are important, where usability for humans was a legitimate criterion for importance, and it was annoying that I thought they didn't believe in carving reality at the joints at all and that categories should be whatever makes people happy.
  
@@ -507,123 +507,158 @@ Michael said this was importantly backwards: less precise targeting is more viol
  
  Polishing the advanced categories argument from earlier email drafts into a solid _Less Wrong_ post didn't take that long: by 6 April, I had an almost-complete draft of the new post, ["Where to Draw the Boundaries?"](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries), that I was pretty happy with.
  
-The title (note: "boundaries", plural) was a play off of ["Where to the Draw the Boundary?"](https://www.lesswrong.com/posts/d5NyJ2Lf6N22AD9PB/where-to-draw-the-boundary) (note: "boundary", singular), a post from Yudkowsky's original Sequence on the 37 wayss in which words can be wrong. In "... Boundary?", Yudkowsky asserts (without argument, as something that all educated people already know) that dolphins don't form a natural category with fish ("Once upon a time it was thought that the word 'fish' included dolphins [...] you could stop playing nitwit games and admit that dolphins don't belong on the fish list"). But Alexander's ["... Not Man for the Categories"](https://slatestarcodex.com/2014/11/21/the-categories-were-made-for-man-not-man-for-the-categories/) directly contradicts this, asserting that there's nothing wrong with with biblical Hebrew word _dagim_ encompassing both fish and cetaceans (dolphins and whales). So who's right, Yudkowsky (2008) or Alexander (2014)? Is there a problem with dolphins being "fish", or not?
+The title (note: "boundaries", plural) was a play off of ["Where to Draw the Boundary?"](https://www.lesswrong.com/posts/d5NyJ2Lf6N22AD9PB/where-to-draw-the-boundary) (note: "boundary", singular), a post from Yudkowsky's [original Sequence](https://www.lesswrong.com/s/SGB7Y5WERh4skwtnb) on the [37 ways in which words can be wrong](https://www.lesswrong.com/posts/FaJaCgqBKphrDzDSj/37-ways-that-words-can-be-wrong). In "... Boundary?", Yudkowsky asserts (without argument, as something that all educated people already know) that dolphins don't form a natural category with fish ("Once upon a time it was thought that the word 'fish' included dolphins [...] Or you could stop playing nitwit games and admit that dolphins don't belong on the fish list"). But Alexander's ["... Not Man for the Categories"](https://slatestarcodex.com/2014/11/21/the-categories-were-made-for-man-not-man-for-the-categories/) directly contradicts this, asserting that there's nothing wrong with the biblical Hebrew word _dagim_ encompassing both fish and cetaceans (dolphins and whales). So who's right, Yudkowsky (2008) or Alexander (2014)? Is there a problem with dolphins being "fish", or not?
  
  In "... Boundaries?", I unify the two positions and explain how both Yudkowsky and Alexander have a point: in high-dimensional configuration space, there's a cluster of finned water-dwelling animals in the subspace of the dimensions along which finned water-dwelling animals are similar to each other, and a cluster of mammals in the subspace of the dimensions along which mammals are similar to each other, and dolphins belong to _both_ of them. _Which_ subspace you pay attention to can legitimately depend on your values: if you don't care about predicting or controlling some particular variable, you have no reason to look for clusters along that dimension.
  
-But _given_ a subspace of interest, the _technical_ criterion of drawing category boundaries around [regions of high density in configuration space](https://www.lesswrong.com/posts/yLcuygFfMfrfK8KjF/mutual-information-and-density-in-thingspace) still applies. There is Law governing which uses of communication signals transmit which information, and the Law can't be brushed off with, "whatever, it's a pragmatic choice, just be nice." I demonstrate the Law with a couple of simple mathematical examples: if you redefine a codeword that originally pointed to one cluster, to also include another, that changes the quantitative predictions you make about an unobserved coordinate given the codeword; if an employer starts giving the title "Vice President" to line workers, that decreases the mutual information between the job title and properties of the job.
+But _given_ a subspace of interest, the _technical_ criterion of drawing category boundaries around [regions of high density in configuration space](https://www.lesswrong.com/posts/yLcuygFfMfrfK8KjF/mutual-information-and-density-in-thingspace) still applies. There is Law governing which uses of communication signals transmit which information, and the Law can't be brushed off with, "whatever, it's a pragmatic choice, just be nice." I demonstrate the Law with a couple of simple mathematical examples: if you redefine a codeword that originally pointed to one cluster in ℝ³, to also include another, that changes the quantitative predictions you make about an unobserved coordinate given the codeword; if an employer starts giving the title "Vice President" to line workers, that decreases the [mutual information](https://en.wikipedia.org/wiki/Mutual_information) between the job title and properties of the job.
  
-(Jessica and Ben's [discussion of the job title example in relation to the _Wikipedia_ summary of Jean Baudrillard's _Simulacra and Simulation_ ended up getting published separately](http://benjaminrosshoffman.com/excerpts-from-a-larger-discussion-about-simulacra/), and ended up taking on a life of its own [in](http://benjaminrosshoffman.com/blame-games/) [future](http://benjaminrosshoffman.com/blatant-lies-best-kind/) [posts](http://benjaminrosshoffman.com/simulacra-subjectivity/), [including](https://www.lesswrong.com/posts/Z5wF8mdonsM2AuGgt/negative-feedback-and-simulacra) [a](https://www.lesswrong.com/posts/NiTW5uNtXTwBsFkd4/signalling-and-simulacra-level-3) [number](https://www.lesswrong.com/posts/tF8z9HBoBn783Cirz/simulacrum-3-as-stag-hunt-strategy) [of](https://www.lesswrong.com/tag/simulacrum-levels) [posts](https://thezvi.wordpress.com/2020/05/03/on-negative-feedback-and-simulacra/) [by](https://thezvi.wordpress.com/2020/06/15/simulacra-and-covid-19/) [other](https://thezvi.wordpress.com/2020/08/03/unifying-the-simulacra-definitions/) [authors](https://thezvi.wordpress.com/2020/09/07/the-four-children-of-the-seder-as-the-simulacra-levels/).)
+(Jessica and Ben's [discussion of the job title example in relation to the _Wikipedia_ summary of Jean Baudrillard's _Simulacra and Simulation_ got published separately](http://benjaminrosshoffman.com/excerpts-from-a-larger-discussion-about-simulacra/), and ended up taking on a life of its own [in](http://benjaminrosshoffman.com/blame-games/) [future](http://benjaminrosshoffman.com/blatant-lies-best-kind/) [posts](http://benjaminrosshoffman.com/simulacra-subjectivity/), [including](https://www.lesswrong.com/posts/Z5wF8mdonsM2AuGgt/negative-feedback-and-simulacra) [a](https://www.lesswrong.com/posts/NiTW5uNtXTwBsFkd4/signalling-and-simulacra-level-3) [number](https://www.lesswrong.com/posts/tF8z9HBoBn783Cirz/simulacrum-3-as-stag-hunt-strategy) [of](https://www.lesswrong.com/tag/simulacrum-levels) [posts](https://thezvi.wordpress.com/2020/05/03/on-negative-feedback-and-simulacra/) [by](https://thezvi.wordpress.com/2020/06/15/simulacra-and-covid-19/) [other](https://thezvi.wordpress.com/2020/08/03/unifying-the-simulacra-definitions/) [authors](https://thezvi.wordpress.com/2020/09/07/the-four-children-of-the-seder-as-the-simulacra-levels/).)
  
  Sarah asked if the math wasn't a bit overkill: were the calculations really necessary to make the basic point that good definitions should be about classifying the world, rather than about what's pleasant or politically expedient to say? I thought the math was _really important_ as an appeal to principle—and [as intimidation](https://slatestarcodex.com/2014/08/10/getting-eulered/). (As it was written, [_the tenth virtue is precision!_](http://yudkowsky.net/rational/virtues/) Even if you cannot do the math, knowing that the math exists tells you that the dance step is precise and has no room in it for your whims.)
  
  "... Boundaries?" explains all this in the form of discourse with a hypothetical interlocutor arguing for the I-can-define-a-word-any-way-I-want position. In the hypothetical interlocutor's parts, I wove in verbatim quotes (without attribution) from Alexander ("an alternative categorization system is not an error, and borders are not objectively true or false") and Yudkowsky ("You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning", "Using language in a way _you_ dislike is not lying. The propositions you claim false [...] is not what the [...] is meant to convey, and this is known to everyone involved; it is not a secret"), and Bensinger ("doesn't unambiguously refer to the thing you're trying to point at").
  
-My thinking here was that the posse's previous email campaigns had been doomed to failure by being too closely linked to the politically-contentious object-level topic which reputable people had strong incentives not to touch with a ten-foot pole. So if I wrote this post _just_ explaining what was wrong with the claims Yudkowsky and Alexander had made about the philosophy of language, with perfectly innocent examples about dolphins and job titles, that would remove the political barrier and [leave a line of retreat](https://www.lesswrong.com/posts/3XgYbghWruBMrPTAL/leave-a-line-of-retreat) for Yudkowsky to correct the philosophy of language error. Then if someone with a threatening social-justicey aura were to say, "Wait, doesn't this contradict what you said about trans people earlier?", stonewall them. (Stonewall _them_ and not _me_!)
+My thinking here was that the posse's previous email campaigns had been doomed to failure by being too closely linked to the politically-contentious object-level topic which reputable people had strong incentives not to touch with a ten-foot pole. So if I wrote this post _just_ explaining what was wrong with the claims Yudkowsky and Alexander had made about the philosophy of language, with perfectly innocent examples about dolphins and job titles, that would remove the political barrier and [leave a line of retreat](https://www.lesswrong.com/posts/3XgYbghWruBMrPTAL/leave-a-line-of-retreat) for Yudkowsky to correct the philosophy of language error. And then if someone with a threatening social-justicey aura were to say, "Wait, doesn't this contradict what you said about trans people earlier?", the reputable people could stonewall them. (Stonewall _them_ and not _me_!)
  
-One reason someone might be reluctant to correct mistakes when pointed out, is the fear that such a policy could be abused by motivated nitpickers. It would be pretty annoying to be obligated to churn out an endless stream of trivial corrections by someone motivated to comb through your entire portfolio and point out every little thing you did imperfectly, ever.
+Another reason someone might be reluctant to correct mistakes when pointed out, is the fear that such a policy could be abused by motivated nitpickers. It would be pretty annoying to be obligated to churn out an endless stream of trivial corrections by someone motivated to comb through your entire portfolio and point out every little thing you did imperfectly, ever.
  
  I wondered if maybe, in Scott or Eliezer's mental universe, I was a blameworthy (or pitiably mentally ill) nitpicker for flipping out over a blog post from 2014 (!) and some Tweets (!!) from November. Like, really? I, too, had probably said things that were wrong _five years ago_.
  
-But, well, I thought I had made a pretty convincing that a lot of people are making a correctable and important rationality mistake, such that the cost of a correction (about the philosophy of language specifically, not any possible implications for gender politics) would actually be justified here. As Ben pointed out, if someone had put _this much_ effort into pointing out an error _I_ had made four months or five years ago and making careful arguments for why it was important to get the right answer, I probably _would_ put some serious thought into it.
+But, well, I thought I had made a pretty convincing case that a lot of people were making a correctable and important rationality mistake, such that the cost of a correction (about the philosophy of language specifically, not any possible implications for gender politics) would actually be justified here. As Ben pointed out, if someone had put _this much_ effort into pointing out an error _I_ had made four months or five years ago and making careful arguments for why it was important to get the right answer, I probably _would_ put some serious thought into it.
  
-I could see a case that it was unfair of me to include subtext and then expect people to engage with the text, but if we weren't going to get into full-on gender-politics on _Less Wrong_ (which seemed like a bad idea), but gender politics _was_ motivating an epistemology error, I wasn't sure what else I was supposed to do! I was pretty constrained here!
+I could see a case that it was unfair of me to include political subtext and then only expect people to engage with the politically-clean text, but if we weren't going to get into full-on gender-politics on _Less Wrong_ (which seemed like a bad idea), but gender politics _was_ motivating an epistemology error, I wasn't sure what else I was supposed to do! I was pretty constrained here!
  
-(I did regret having accidentally "poisoned the well" the previous month by impulsively sharing the previous year's ["Blegg Mode"](/2018/Feb/blegg-mode/) [as a _Less Wrong_ linkpost](https://www.lesswrong.com/posts/GEJzPwY8JedcNX2qz/blegg-mode). "Blegg Mode" had originally been drafted as part of "... To Make Predictions" before getting spun off as a separate post. Frustrated in March at our failing email campaign, I thought it was politically "clean" enough to belatedly share, but it proved to be insufficiently [deniably allegorical](/tag/deniably-allegorical/). It's plausible that some portion of the _Less Wrong_ audience would have been more receptive to "... Boundaries?" as not-politically-threatening philosophy, if they hadn't been alerted to the political context by the 60+-comment trainwreck on the "Blegg Mode" linkpost.)
+(I did regret having accidentally "poisoned the well" the previous month by impulsively sharing the previous year's ["Blegg Mode"](/2018/Feb/blegg-mode/) [as a _Less Wrong_ linkpost](https://www.lesswrong.com/posts/GEJzPwY8JedcNX2qz/blegg-mode). "Blegg Mode" had originally been drafted as part of "... To Make Predictions" before getting spun off as a separate post. Frustrated in March at our failing email campaign, I thought it was politically "clean" enough to belatedly share, but it proved to be insufficiently [deniably allegorical](/tag/deniably-allegorical/), as evidenced by the 60-plus-entry trainwreck of a comments section. It's plausible that some portion of the _Less Wrong_ audience would have been more receptive to "... Boundaries?" as not-politically-threatening philosophy, if they hadn't been alerted to the political context by the comments on the "Blegg Mode" linkpost.)
  
-On 13 April, I pulled the trigger on publishing "... Boundaries?", and wrote to Yudkowsky again, a fourth time (!), asking if he could _either_ publicly endorse the post, _or_ publicly comment on what he thought the post got right and what he thought it got wrong; and, that if engaging on this level was too expensive for him in terms of spoons, if there was any action I could take to somehow make it less expensive? The reason I thought this was important was that if rationalists in [good standing](https://srconstantin.wordpress.com/2018/12/24/contrite-strategies-and-the-need-for-standards/) find themselves in a persistent disagreement _about rationality itself_—in this case, my disagreement with Scott Alexander and others about the cognitive function of categories—that seemed like a major concern for [our common interest](https://www.lesswrong.com/posts/4PPE6D635iBcGPGRy/rationality-common-interest-of-many-causes), something we should be very eager to _definitively settle in public_ (or at least _clarify_ the current state of the disagreement). In the absence of an established "rationality court of last resort", I feared the closest thing we had was an appeal to Eliezer Yudkowsky's personal judgement. Despite the context in which the dispute arose, _this wasn't a political issue_. We had _nothing to be afraid of_ here. The post I was asking for his comment on was _just_ about the [_mathematical laws_](https://www.lesswrong.com/posts/eY45uCCX7DdwJ4Jha/no-one-can-exempt-you-from-rationality-s-laws) governing how to talk about, _e.g._, dolphins (Subject: "movement to clarity; or, rationality court filing").
+On 13 April, I pulled the trigger on publishing "... Boundaries?", and wrote to Yudkowsky again, a fourth time (!), asking if he could _either_ publicly endorse the post, _or_ publicly comment on what he thought the post got right and what he thought it got wrong; and, that if engaging on this level was too expensive for him in terms of spoons, if there was any action I could take to somehow make it less expensive? The reason I thought this was important, I explained, was that if rationalists in [good standing](https://srconstantin.github.io/2018/12/24/contrite-strategies-and-the-need-for-standards/) find themselves in a persistent disagreement _about rationality itself_—in this case, my disagreement with Scott Alexander and others about the cognitive function of categories—that seemed like a major concern for [our common interest](https://www.lesswrong.com/posts/4PPE6D635iBcGPGRy/rationality-common-interest-of-many-causes), something we should be very eager to _definitively settle in public_ (or at least _clarify_ the current state of the disagreement). In the absence of an established "rationality court of last resort", I feared the closest thing we had was an appeal to Eliezer Yudkowsky's personal judgement. Despite the context in which the dispute arose, _this wasn't a political issue_. The post I was asking for his comment on was _just_ about the [_mathematical laws_](https://www.lesswrong.com/posts/eY45uCCX7DdwJ4Jha/no-one-can-exempt-you-from-rationality-s-laws) governing how to talk about, _e.g._, dolphins. We had _nothing to be afraid of_ here. (Subject: "movement to clarity; or, rationality court filing").
  
  I got some pushback from Ben and Jessica about claiming that this wasn't "political". What I meant by that was to emphasize (again) that I didn't expect Yudkowsky or "the community" to take a public stance _on gender politics_; I was trying to get "us" to take a stance in favor of the kind of _epistemology_ that we were doing in 2008. It turns out that epistemology has implications for gender politics which are unsafe, but that's _more inferential steps_, and ... I guess I just didn't expect the sort of people who would punish good epistemology to follow the inferential steps?
  
-Anyway, again, without revealing any content from private conversations that may or may not have occurred, we did not get any public engagement from Yudkowsky.
+Anyway, again without revealing any content from the other side of any private conversations that may or may not have occurred, we did not get any public engagement from Yudkowsky.
  
  It seemed that the Category War was over, and we lost.
  
  We _lost?!_ How could we _lose?!_ The philosophy here was _very clear-cut_. This _shouldn't_ be hard or expensive or difficult to clear up. I could believe that Alexander was "honestly" confused, but Yudkowsky ...!?
  
-I could see how, under ordinary circumstances, asking Yudkowsky to weigh in on my post would be inappropriately demanding of a Very Important Person's time, given that a simple person such as me was surely as a mere _worm_ in the presence of the great Eliezer Yudkowsky.
+I could see how, under ordinary circumstances, asking Yudkowsky to weigh in on my post would be inappropriately demanding of a Very Important Person's time, given that an ordinary programmer such as me was surely as a mere _worm_ in the presence of the great Eliezer Yudkowsky. (I would have humbly given up much sooner if I hadn't gotten social proof from Michael and Ben and Sarah and secret posse member and Jessica.)
  
-(That's why the social proof from Michael + Ben + Sarah + Jessica + secret-posse-member was so essential.)
-
-But the only reason for my post to exist was because it would be even _more_ inappropriately demanding to ask for a clarification in the original gender-political context. I _don't_ think it was inappropriately demanding to expect "us" (him) to _be correct about the cognitive function of categorization_. (If not, why pretend to have a "rationality community" at all?) I was trying to be as accomodating as possible, given that decideratum.
+But the only reason for my post to exist was that it would be even _more_ inappropriately demanding to ask for a clarification in the original gender-political context. I _don't_ think it was inappropriately demanding to expect "us" (him) to _be correct about the cognitive function of categorization_. (If not, why pretend to have a "rationality community" at all?) I was _trying_ to be as accommodating as I could, short of just letting him (us?) be wrong.
  
  Jessica mentioned talking with someone about me writing to Yudkowsky and Alexander requesting that they clarify the category boundary thing. This person described having a sense that I should have known that wouldn't work—because of the politics involved, not because I wasn't right. I thought Jessica's takeaway was very poignant:
  
  > Those who are savvy in high-corruption equilibria maintain the delusion that high corruption is common knowledge, to justify expropriating those who naively don't play along, by narratizing them as already knowing and therefore intentionally attacking people, rather than being lied to and confused.
  
-_Should_ I have known that it wouldn't work? _Didn't_ I "already know", at some level? I guess in retrospect, the outcome does seem kind of "obvious"—that it should have been possible to predict in advance and make the corresponding update without so much fuss and wasting so many people's time.
+_Should_ I have known that it wouldn't work? _Didn't_ I "already know", at some level?
+
+I guess in retrospect, the outcome does seem kind of "obvious"—that it should have been possible to predict in advance, and to make the corresponding update without so much fuss and wasting so many people's time.
+
+But ... it's only "obvious" if you _take as a given_ that Yudkowsky is playing a savvy Kolmogorov complicity strategy like any other public intellectual in the current year. Maybe this seems banal if you haven't spent your entire adult life in his robot cult?
+
+But since I _did_ spend my entire adult life in his robot cult, trusting him the way a Catholic trusts the Pope, I _had_ to assume that the "hill of validity in defense of meaning" Twitter performance was an "honest mistake" in his rationality lessons, and that honest mistakes could be corrected if someone put in the effort to explain the problem. The idea that Eliezer Yudkowsky was going to behave just as badly as any other public intellectual in the current year was not really in my hypothesis space. It took some _very large_ likelihood ratios to beat it into my head that the thing that was obviously happening was actually happening.
  
-But ... it's only "obvious" if you _take as a given_ that Yudkowsky is playing a savvy Kolmogorov complicity strategy like any other public intellectual in the current year. Maybe this seems banal if you haven't spent your entire life in this robot cult? But the guy doesn't _market_ himself as being like any other public intellectual in the current year. As Ben put it, Yudkowsky's "claim to legitimacy really did amount to a claim that while nearly everyone else was criminally insane (causing huge amounts of damage due to disconnect from reality, in a way that would be criminal if done knowingly), he almost uniquely was not." Call me a sucker, but ... I _actually believed_ Yudkowsky's marketing story. The Sequences _really were just that good_. That's why it took so much fuss and wasted time to generate a likelihood ratio large enough to falsify that story.
+Ben shared the account of our posse's email campaign with someone, who commented that I had "sacrificed all hope of success in favor of maintaining his own sanity by CC'ing you guys." That is, if I had been brave enough to confront Yudkowsky by myself, _maybe_ there was some hope of him seeing that the game he was playing was wrong. But because I was so cowardly as to need social proof (because I believed that an ordinary programmer such as me was as a mere worm in the presence of the great Eliezer Yudkowsky), it must have just looked to him like an illegible social plot originating from Michael.
  
-Ben compared Yudkowsky to Eliza the spambot therapist in my story ["Blame Me for Trying"](/2018/Jan/blame-me-for-trying/). Scrupulous rationalists were paying rent to something claiming moral authority, which had no concrete specific plan to do anything other than run out the clock. Minds like mine don't surive long-run in this ecosystem. If we wanted minds that do "naïve" inquiry instead of playing savvy Kolmogorov games to survive, we needed an interior that justified that level of trust.
+One might wonder why this was such a big deal to us. Okay, so Yudkowsky had prevaricated about his own philosophy of language for transparently political reasons, and couldn't be moved to clarify in public even after me and my posse spent an enormous amount of effort trying to explain the problem. So what? Aren't people wrong on the internet all the time?
  
-[TODO: weave in "set in motion a machine" 19 Apr?]
+Ben explained: Yudkowsky had set in motion a marketing machine (the "rationalist community") that was continuing to raise funds and demand work from people for below-market rates based on the claim that while nearly everyone else was criminally insane (causing huge amounts of damage due to disconnect from reality, in a way that would be criminal if done knowingly), he, almost uniquely, was not. If the claim was _true_, it was important to make, and to actually extract that labor. "Work for me or the world ends badly," basically.
  
-[TODO Jack—
-> Zack sacrificed all hope of success in favor of maintaining his own sanity by CC'ing you guys (which I think he was correct to do conditional on email happening at all).]
+But we had just falsified to our satisfaction the claim that Yudkowsky was currently sane in the relevant way (which was an _extremely high_ standard, and not a special flaw of Yudkowsky in the current environment). If Yudkowsky couldn't be bothered to live up to his own stated standards or withdraw his validation from the machine he built after we had _tried_ to talk to him privately, then we had a right to talk in public about what we thought was going on.
+
+This wasn't about direct benefit _vs._ harm. This was about what, substantively, the machine was doing. They claimed to be cultivating an epistemically rational community, while in fact building an army of loyalists.
+
+Ben compared the whole set-up to that of Eliza the spambot therapist in my story ["Blame Me for Trying"](/2018/Jan/blame-me-for-trying/): regardless of the _initial intent_, scrupulous rationalists were paying rent to something claiming moral authority, which had no concrete specific plan to do anything other than run out the clock, maintaining a facsimile of dialogue in ways well-calibrated to continue to generate revenue. Minds like mine wouldn't survive long-run in this ecosystem. If we wanted minds that do "naïve" inquiry instead of playing savvy Kolmogorov games to survive, we needed an interior that justified that level of trust.
  
  -------
  
-curation hopes ... 22 Jun: I'm expressing a little bit of bitterness that a mole rats post got curated https://www.lesswrong.com/posts/fDKZZtTMTcGqvHnXd/naked-mole-rats-a-case-study-in-biological-weirdness
+Given that the "rationalists" were fake and that we needed something better, there remained the question of what to do about that, and how to relate to the old thing, and the maintainers of the marketing machine for the old thing.
  
-"Univariate fallacy" also a concession
-(which I got to cite in https://www.lesswrong.com/posts/cu7YY7WdgJBs3DpmJ/the-univariate-fallacy which I cited in "Schelling Categories")
+_I_ had been hyperfocused on prosecuting my Category War, but the reason Michael and Ben and Jessica were willing to help me out on that, was not because they particularly cared about the gender and categories example, but because it seemed like a manifestation of a _more general_ problem of epistemic rot in "the community". 
  
-https://slatestarcodex.com/2019/07/04/some-clarifications-on-rationalist-blogging/
+Ben had previously written a lot about problems with Effective Altruism. Jessica had had a bad time at MIRI, as she had told me back in March, and would [later](https://www.lesswrong.com/posts/KnQs55tjxWopCzKsk/the-ai-timelines-scam) [write](https://www.lesswrong.com/posts/MnFqyPLqbiKL8nSR7/my-experience-at-and-around-miri-and-cfar-inspired-by-zoe) [about](https://www.lesswrong.com/posts/pQGFeKvjydztpgnsY/occupational-infohazards). To what extent were my thing, and Ben's thing, and Jessica's thing, manifestations of "the same" underlying problem? Or had we all become disaffected with the mainstream "rationalists" for our own idiosyncratic reasons, and merely randomly fallen into each other's, and Michael's, orbit?
  
-"Yes Requires the Possibility of No" 19 May https://www.lesswrong.com/posts/WwTPSkNwC89g3Afnd/comment-section-from-05-19-2019
+I believed that there _was_ a real problem, but didn't feel like I had a good grasp on what it was specifically. Cultural critique is a fraught endeavor: if someone tells an outright lie, you can, maybe, with a lot of effort, prove that to other people, and get a correction on that specific point. (Actually, as we had just discovered, even that might be too much to hope for.) But _culture_ is the sum of lots and lots of little micro-actions by lots and lots of people. If your _entire culture_ has visibly departed from the Way that was taught to you in the late 'aughts, how do you demonstrate that to people who, to all appearances, are acting like they don't remember the old Way, or like they don't think anything has changed, or like they notice some changes but think the new way is better? It's not as simple as shouting, "Hey guys, Truth matters!"—any ideologue or religious person would agree with _that_.
  
-scuffle on LessWrong FAQ 31 May https://www.lesswrong.com/posts/MqrzczdGhQCRePgqN/feedback-requested-draft-of-a-new-about-welcome-page-for#iqEEme6M2JmZEXYAk
+Ben called it the Blight, after the rogue superintelligence in _A Fire Upon the Deep_: the problem wasn't that people were getting dumber; it was that there was locally coherent coordination away from clarity and truth and towards coalition-building, which was validated by the official narrative in ways that gave it a huge tactical advantage; people were increasingly making decisions that were better explained by their political incentives than by coherent beliefs about the world.
  
-"epistemic defense" meeting
+When I asked him for specific examples of MIRI or CfAR leaders behaving badly, he gave the example of MIRI executive director Nate Soares posting that he was "excited" about the launch of OpenAI, despite the fact that [_no one_ who had been following the AI risk discourse](https://slatestarcodex.com/2015/12/17/should-ai-be-open/) [thought that OpenAI as originally announced was a good idea](http://benjaminrosshoffman.com/openai-makes-humanity-less-safe/). Nate had privately clarified to Ben that the word "excited" wasn't necessarily meant positively, and in this case meant something more like "terrified." 
  
-[TODO section on factional conflict:
-Michael on Anna as cult leader
-Jessica told me about her time at MIRI (link to Zoe-piggyback and Occupational Infohazards)
-24 Aug: I had told Anna about Michael's "enemy combatants" metaphor, and how I originally misunderstood
-me being regarded as Michael's pawn
-assortment of agendas
-mutualist pattern where Michael by himself isn't very useful for scholarship (he just says a lot of crazy-sounding things and refuses to explain them), but people like Sarah and me can write intelligible things that secretly benefited from much less legible conversations with Michael.
-]
+This seemed to me like the sort of thing where a particularly principled (naive?) person might say, "That's _lying for political reasons!_ That's _contrary to the moral law!_" and most ordinary grown-ups would say, "Why are you so upset about this? That sort of strategic phrasing in press releases is just how the world works, and things could not possibly be otherwise."
  
-8 Jun: I think I subconsciously did an interesting political thing in appealing to my price for joining
+I thought explaining the Blight to an ordinary grown-up was going to need _either_ lots of specific examples that were way more egregious than this (and more egregious than the examples in "EA Has a Lying Problem" or ["Effective Altruism Is Self-Recommending"](http://benjaminrosshoffman.com/effective-altruism-is-self-recommending/)), or somehow convincing the ordinary grown-up why "just how the world works" isn't good enough, and why we needed one goddamned place in the entire goddamned world (perhaps a private place) with _unusually high standards_.
  
-REACH panel
+The schism introduced new pressures on my social life. On 20 April, I told Michael that I still wanted to be friends with people on both sides of the factional schism (in the frame where recent events were construed as a factional schism), even though I was on this side. Michael said that we should unambiguously regard Anna and Eliezer as criminals or enemy combatants (!!), who could claim no rights in regard to me or him.
  
-(Subject: "Michael Vassar and the theory of optimal gossip")
+I don't think I "got" the framing at this time. War metaphors sounded Scary and Mean: I didn't want to shoot my friends! But the point of the analogy (which Michael explained, but I wasn't ready to hear until I did a few more weeks of emotional processing) was specifically that soldiers on the other side of a war _aren't_ particularly morally blameworthy as individuals: their actions are just being controlled by the Power they're embedded in.
  
+I wrote to Anna:
  
-Since arguing at the object level had failed (["... To Make Predictions"](/2018/Feb/the-categories-were-made-for-man-to-make-predictions/), ["Reply on Adult Human Females"](/2018/Apr/reply-to-the-unit-of-caring-on-adult-human-females/)), and arguing at the strictly meta level had failed (["... Boundaries?"](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries)), the obvious thing to do next was to jump up to the meta-meta level and tell the story about why the "rationalists" were Dead To Me now, that [my price for joining](https://www.lesswrong.com/posts/Q8evewZW5SeidLdbA/your-price-for-joining) was not being met. (Just like Ben had suggested in December and in April.)
+> I was _just_ trying to publicly settle a _very straightforward_ philosophy thing that seemed _really solid_ to me
+>
+> if, in the process, I accidentally ended up being an unusually useful pawn in Michael Vassar's deranged four-dimensional hyperchess political scheming
+>
+> that's ... _arguably_ not my fault
+
+-----
+
+I may have subconsciously pulled off an interesting political thing. In my final email to Yudkowsky on 20 April (Subject: "closing thoughts from me"), I had written—
  
-I found it trouble to make progress on. I felt—constrained. I didn't know how to tell the story without (as I perceived it) escalating personal conflicts or leaking info from private conversations. So instead, I mostly turned to a combination of writing bitter and insulting comments whenever I saw someone praise "the rationalists" collectively, and—more philosophy-of-language blogging!
+> If we can't even get a public consensus from our _de facto_ leadership on something _so basic_ as "concepts need to carve reality at the joints in order to make probabilistic predictions about reality", then, in my view, there's _no point in pretending to have a rationalist community_, and I need to leave and go find something else to do (perhaps whatever Michael's newest scheme turns out to be). I don't think I'm setting [my price for joining](https://www.lesswrong.com/posts/Q8evewZW5SeidLdbA/your-price-for-joining) particularly high here?
  
-In August's ["Schelling Categories, and Simple Membership Tests"](https://www.lesswrong.com/posts/edEXi4SpkXfvaX42j/schelling-categories-and-simple-membership-tests), I explained a nuance that had only merited a passion mention in "... Boundaries?": sometimes you might want categories for different agents to _coordinate_ on, even at the cost of some statistical "fit." (This was of course generalized from a "pro-trans" argument that had occured to me, [that self-identity is an easy Schelling point when different people disagree about what "gender" they perceive someone as](/2019/Oct/self-identity-is-a-schelling-point/).)
+And as it happened, on 5 May, Yudkowsky reTweeted Colin Wright on the "univariate fallacy"—the point that group differences aren't a matter of any single variable—which was _sort of_ like the clarification I had been asking for. (Empirically, it made me feel a lot less personally aggrieved.) Was I wrong to interpret this as another "concession" to me? (Again, notwithstanding that the whole mindset of extracting "concessions" was corrupt and not what our posse was trying to do.)
  
-[TODO— more blogging 2019
+Separately, I visited some friends' house on 30 April saying, essentially (and sincerely), "[Oh man oh jeez](https://www.youtube.com/watch?v=NivwAQ8sUYQ), Ben and Michael want me to join in a rationalist civil war against the corrupt mainstream-rationality establishment, and I'd really rather not, and I don't like how they keep using scary hyperbolic words like 'cult' and 'war' and 'criminal', but on the other hand, they're _the only ones backing me up_ on this _incredibly basic philosophy thing_ and I don't feel like I have anywhere else to _go_." The ensuing group conversation made some progress, but was mostly pretty horrifying.
  
-"Algorithms of Deception!" Oct 2019
+In an adorable twist, my friends' two-year-old son was reportedly saying the next day that Kelsey doesn't like his daddy, which was confusing until it was figured out he had heard Kelsey talking about why she doesn't like Michael _Vassar_.
  
-"Maybe Lying Doesn't Exist" Oct 2019
+And as it happened, on 8 May, Kelsey wrote a Facebook comment displaying evidence of understanding my point.
  
-I was _furious_ at "Against Lie Inflation"—oh, so _now_ you agree that making language less useful is a problem?! But then I realized Scott actually was being consistent in his own frame: he's counting "everyone is angrier" (because of more frequent lying-accusations) as a cost; but, if everyone _is_ lying, maybe they should be angry!
+These two datapoints led me to a psychological hypothesis (which was maybe "obvious", but I hadn't thought about it before): when people see someone wavering between their coalition and a rival coalition, they're motivated to offer a few concessions to keep the wavering person on their side. Kelsey could _afford_ (_pace_ Upton Sinclair) to not understand the thing about sex being a natural category ("I don't think 'people who'd get surgery to have the ideal female body' cuts anything at the joints"!!) when it was just me freaking out alone, but "got it" almost as soon as I could credibly threaten to _walk_ (defect to a coalition of people she dislikes) ... and maybe my "closing thoughts" email had a similar effect on Yudkowsky (assuming he otherwise wouldn't have spontaneously tweeted something about the univariate fallacy two weeks later)?? This probably wouldn't work if you repeated it (or tried to do it consciously)?
  
-"Heads I Win" Sep 2019: I was surprised by how well this did (high karma, later included in the best-of-2019 collection); Ben and Jessica had discouraged me from bothering after I 
+----
  
-"Firming Up ..." Dec 2019: combatting Yudkowsky's not-technically-lying shenanigans
+I started drafting a "why I've been upset for five months and have lost faith in the so-called 'rationalist' community" personal-narrative Diary-like post. Ben said that the target audience to aim for was people like I was a few years ago, who hadn't yet had the experiences I had—so they wouldn't have to freak out to the point of being imprisoned and demand help from community leaders and not get it; they could just learn from me. That is, the actual sympathetic-but-naïve people could learn. Not the people messing with me.
  
+I didn't know how to finish it. I was too psychologically constrained; I didn't know how to tell the Whole Dumb Story without (as I perceived it) escalating personal conflicts or leaking info from private conversations.
+
+I decided to take a break from the religious civil war [for a month](http://zackmdavis.net/blog/2019/05/may-is-math-and-wellness-month/) [or two](/2019/May/hiatus/).
+
+My dayjob performance had been suffering terribly for months. The psychology of the workplace is ... subtle. There's a phenomenon where some people are _way_ more productive than others and everyone knows it, but no one is cruel enough [to make it _common_ knowledge](https://slatestarcodex.com/2015/10/15/it-was-you-who-made-my-blue-eyes-blue/), which is awkward for people who simultaneously benefit from the culture of common-knowledge-prevention allowing them to collect the status and money rents of being a $150K/yr software engineer without actually [performing at that level](http://zackmdavis.net/blog/2013/12/fortune/), while also having [read enough Ayn Rand as a teenager](/2017/Sep/neither-as-plea-nor-as-despair/) to be ideologically opposed to subsisting on unjustly-acquired rents rather than value creation. The "everyone knows I feel guilty about underperforming, so they don't punish me because I'm already doing enough internalized domination to punish myself" dynamic would be unsustainable if it were to evolve into a loop of "feeling guilt _in exchange for_ not doing work" rather than the intended "feeling guilt in order to successfully incentivize work". I didn't think they would actually fire me, but I was worried that they _should_. I asked my boss if I could temporarily take on some easier tasks that I could make steady progress on even while being psychologically impaired from a religious war. (We had a lot of LaTeX templating of insurance policy amendments that needed to get done.) If I was going to be psychologically impaired _anyway_, it was better to be upfront about how I could best serve the company given that impairment, rather than hoping that the boss wouldn't notice.
+
+My "intent" to take a break from the religious war didn't take.
+
+[TODO: tussle with Anna, was thinking of writing a public reply to her comment against Michael]
+
+[TODO: tussle on "Yes Requires the Possibility of No" https://www.lesswrong.com/posts/WwTPSkNwC89g3Afnd/comment-section-from-05-19-2019 ]
+
+[TODO: tussle on new _Less Wrong_ FAQ 31 May https://www.lesswrong.com/posts/MqrzczdGhQCRePgqN/feedback-requested-draft-of-a-new-about-welcome-page-for#iqEEme6M2JmZEXYAk ]
+
+[TODO: more philosophy-of-language blogging! and bitter grief comments
+https://www.greaterwrong.com/posts/tkuknrjYCbaDoZEh5/could-we-solve-this-email-mess-if-we-all-moved-to-paid/comment/ZkreTspP599RBKsi7
+https://www.greaterwrong.com/posts/FT9Lkoyd5DcCoPMYQ/partial-summary-of-debate-with-benquo-and-jessicata-pt-1/comment/vPekZcouSruiCco3c
  ]
  
+[TODO: 17– Jun, "LessWrong.com is dead to me" in response to "It's Not the Incentives", comment on Ray's behavior, "If clarity seems like death to them and like life to us"; Bill Brent, "Causal vs. Social Reality", I met with Ray 29 Jun; https://www.greaterwrong.com/posts/bwkZD6uskCQBJDCeC/self-consciousness-wants-to-make-everything-about-itself ; calling out the abstract pattern]
+
+[TODO: https://slatestarcodex.com/2019/07/04/some-clarifications-on-rationalist-blogging/]
+
+[TODO: "AI Timelines Scam", within-group debate on what is a "scam" or "fraud", Pope]
+
+[TODO: epistemic defense meeting; the first morning where "rationalists ... them" felt more natural than "rationalists ... us"]
+
+[TODO: Michael Vassar and the theory of optimal gossip; make sure to include the part about Michael threatening to sue]
+
+[TODO: various tussling with Steven Kaas]
  
  [TODO: Yudkowsky throwing NRx under the bus; tragedy of recursive silencing
  15 Sep Glen Weyl apology
  ]
  
-
  In November, I received an interesting reply on my philosophy-of-categorization thesis from MIRI researcher Abram Demski. Abram asked: ideally, shouldn't all conceptual boundaries be drawn with appeal-to-consequences? Wasn't the problem just with bad (motivated, shortsighted) appeals to consequences? Agents categorize in order to make decisions. The best classifier for an application depends on the costs and benefits. As a classic example, it's very important for evolved prey animals to avoid predators, so it makes sense for their predator-detection classifiers to be configured such that they jump away from every rustling in the bushes, even if it's usually not a predator.
  
  I had thought of the "false-positives are better than false-negatives when detecting predators" example as being about the limitations of evolution as an AI designer: messy evolved animal brains don't bother to track probability and utility separately the way a cleanly-designed AI could. As I had explained in "... Boundaries?", it made sense for _what_ variables you paid attention to, to be motivated by consequences. But _given_ the subspace that's relevant to your interests, you want to run an epistemically legitimate clustering algorithm on the data you see there, which depends on the data, not your values. The only reason value-dependent gerrymandered category boundaries seem like a good idea if you're not careful about philosophy is because it's _wireheading_. Ideal probabilistic beliefs shouldn't depend on consequences.
  
-Abram didn't think the issue was so clear-cut. Where do "probabilities" come from, in the first place? The reason we expect something like Bayesianism to be an attractor among self-improving agents is _because_ probabilistic reasoning is broadly useful: epistemology can be _derived_ from instrumental concerns. He agreed that severe wireheading issues _potentially_ arise if you allow consequentialist concerns to affect your epistemics—
+Abram didn't think the issue was so clear-cut. Where do "probabilities" come from, in the first place? The reason we expect something like Bayesianism to be an attractor among self-improving agents is _because_ probabilistic reasoning is broadly useful: epistemology can be _derived_ from instrumental concerns. He agreed that severe wireheading issues _potentially_ arise if you allow consequentialist concerns to affect your epistemics.
  
  But the alternative view had its own problems. If your AI consists of a consequentialist module that optimizes for utility in the world, and an epistemic module that optimizes for the accuracy of its beliefs, that's _two_ agents, not one: how could that be reflectively coherent? You could, perhaps, bite the bullet here, for fear that consequentialism doesn't tile and that wireheading was inevitable. On this view, Abram explained, "Agency is an illusion which can only be maintained by crippling agents and giving them a split-brain architecture where an instrumental task-monkey does all the important stuff while an epistemic overseer supervises." Whether this view was ultimately tenable or not, this did show that trying to forbid appeals-to-consequences entirely led to strange places. I didn't immediately have an answer for Abram, but I was grateful for the engagement. (Abram was clearly addressing the real philosophical issues, and not just trying to mess with me the way almost everyone else in Berkeley was trying to mess with me.)
  
@@ -647,7 +682,7 @@ I said I would bite that bullet: yes! Yes, I was trying to figure out whether I
  
  [TODO: plan to reach out to Rick]
  
-[TODO:
+[TODO: December tussle with Scott, and, a Christmas party—
  Scott replies on 21 December https://www.lesswrong.com/posts/bSmgPNS6MTJsunTzS/maybe-lying-doesn-t-exist?commentId=LJp2PYh3XvmoCgS6E
  
  > since these are not about factual states of the world (eg what the definition of "lie" REALLY is, in God's dictionary) we have nothing to make those decisions on except consequences
@@ -660,10 +695,8 @@ people reading funny GPT-2 quotes
  
  A MIRI researcher sympathetically told me that it would be sad if I had to leave the Bay Area, which I thought was nice. There was nothing about the immediate conversational context to suggest that I might have to leave the Bay, but I guess by this point, my existence had become a context.
  
-motivation deflates after Christmas victory
-5 Jan memoir as nuke
-]
-
+memoir motivation deflates after Christmas victory
+5 Jan memoir as nuke]
  
  -------
  
@@ -717,8 +750,7 @@ Given that I spent so many hours on this little research/writing project in earl
  https://slatestarcodex.com/2020/09/11/update-on-my-situation/
  ]
  
-[TODO: "out of patience" email
-
+[TODO: "out of patience" email]
  
  > To: Eliezer Yudkowsky <[redacted]>  
  > Cc: Anna Salamon <[redacted]>  
@@ -742,7 +774,7 @@ https://slatestarcodex.com/2020/09/11/update-on-my-situation/
  >
  > I agree that pronouns don't have the same function as ordinary nouns. However, **in the English language as actually spoken by native speakers, I think that gender pronouns _do_ have effective "truth conditions" _as a matter of cognitive science_.** If someone said, "Come meet me and my friend at the mall; she's really cool and you'll like her", and then that friend turned out to look like me, **you would be surprised**.
  > 
-> I don't see the _substantive_ difference between "You're not standing in defense of truth [...]" and "I can define a word any way I want." [...]
+> I don't see the _substantive_ difference between "You're not standing in defense of truth (...)" and "I can define a word any way I want." [...]
  >
  > [...]
  >
@@ -778,10 +810,9 @@ is make this simple thing established "rationalist" knowledge:
  [TODO: Sep 2020 categories clarification from EY—victory?!
  https://www.facebook.com/yudkowsky/posts/10158853851009228
  _ex cathedra_ statement that gender categories are not an exception to the rule, only 1 year and 8 months after asking for it
-
  ]
  
-[TODO: briefly mention breakup with Vassar group]
+[TODO: Sasha disaster, breakup with Vassar group]
  
  [TODO: "Unnatural Categories Are Optimized for Deception"
  
@@ -794,7 +825,6 @@ Embedded agency means that the AI shouldn't have to fundamentally reason differe
  somehow accuracy seems more fundamental than power or resources ... could that be formalized?
  ]
  
-
  And really, that _should_ have been the end of the story. At the trifling cost of two years of my life, we finally got a clarification from Yudkowsky that you can't define the word _woman_ any way you like. I didn't think I was entitled to anything more than that. I was satisfied. I still published "Unnatural Categories Are Optimized for Deception" in January 2021, but if I hadn't been further provoked, I wouldn't have had occasion to continue waging the robot-cult religious civil war.
  
  [TODO: NYT affair and Brennan link
@@ -1077,8 +1107,6 @@ Let's recap.
  * ...
  ]
  
-
-
  I _never_ expected to end up arguing about something so _trivial_ as the minutiae of pronoun conventions (which no one would care about if historical contingencies of the evolution of the English language hadn't made them a Schelling point and typographical attack surface for things people do care about). The conversation only ended up here after a series of derailings. At the start, I was _trying_ to say something substantive about the psychology of straight men who wish they were women.
  
  _After it's been pointed out_, it should be a pretty obvious hypothesis that "guy on the Extropians mailing list in 2004 who fantasizes about having a female counterpart" and "guy in 2016 Berkeley who identifies as a trans woman" are the _same guy_. 
@@ -1134,25 +1162,23 @@ Scott Alexander chose Feelings, but I can't really hold that against him, becaus
  Eliezer Yudkowsky ... did not _unambiguously_ choose Feelings. He's been very careful with his words to strategically mood-affiliate with the side of Feelings, without consciously saying anything that he knows to be unambiguously false.
  
  
-
-
-
+[TODO— finish Yudkowsky trying to be a religious leader
  Eliezer Yudkowsky is _absolutely_ trying to be a religious leader.
  
  If Eliezer Yudkowsky can't _unambiguously_ choose Truth over Feelings, _then Eliezer Yudkowsky is a fraud_.
  
  ]
  
-
-
-[TODO section stakes, cooperation
-
-at least Sabbatai Zevi had an excuse: his choices were to convert to Islam or be impaled https://en.wikipedia.org/wiki/Sabbatai_Zevi#Conversion_to_Islam
+[TODO section existential stakes, cooperation]
  
  > [_Perhaps_, replied the cold logic](https://www.yudkowsky.net/other/fiction/the-sword-of-good). _If the world were at stake._
  >
  > _Perhaps_, echoed the other part of himself, _but that is not what was actually happening._
  
+[TODO: social justice and defying threats
+
+at least Sabbatai Zevi had an excuse: his choices were to convert to Islam or be impaled https://en.wikipedia.org/wiki/Sabbatai_Zevi#Conversion_to_Islam
+]
  
  
  I like to imagine that they have a saying out of dath ilan: once is happenstance; twice is coincidence; _three times is hostile optimization_.
@@ -1269,23 +1295,16 @@ I don't doubt Yudkowsky could come up with some clever casuistry why, _technical
  
  [TODO: elaborate on how 2007!Yudkowsky and 2021!Xu are saying the opposite things if you just take a plain-language reading and consider, not whether individual sentences can be interpreted as "true", but what kind of _optimization_ the text is doing to the behavior of receptive readers]
  
+[TODO: body odor anecdote]
+
  [TODO: if he's reading this, win back respect— reply, motherfucker]
  
  [TODO: the Death With Dignity era
  
  "Death With Dignity" isn't really an update; he used to refuse to give a probability, and now he says the probability is ~0
  
-https://twitter.com/esyudkowsky/status/1164332124712738821
-> I unfortunately have had a policy for over a decade of not putting numbers on a few things, one of which is AGI timelines and one of which is *non-relative* doom probabilities.  Among the reasons is that my estimates of those have been extremely unstable.
-
-
-
  /2017/Jan/from-what-ive-tasted-of-desire/
  
  ]
  
-I don't, actually, know how to prevent the world from ending. Probably we were never going to survive. (The cis-human era of Earth-originating intelligent life wasn't going to last forever, and it's hard to exert detailed control over what comes next.) But if we're going to die either way, I think it would be _more dignified_ if Eliezer Yudkowsky were to behave as if he wanted his faithful students to be informed. Since it doesn't look like we're going to get that, I think it's _more dignified_ if his faithful students _know_ that he's not behaving like he wants us to be informed. And so one of my goals in telling you this long story about how I spent (wasted?) the last six years of my life, is to communicate the moral that 
-
-and that this is a _problem_ for the future of humanity, to the extent that there is a future of humanity.
-
-Is that a mean thing to say about someone to whom I owe so much? Probably. But he didn't create me to not say mean things. If it helps—as far as _I_ can tell, I'm only doing what he taught me to do in 2007–9: [carve reality at the joints](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries), [speak the truth even if your voice trembles](https://www.lesswrong.com/posts/pZSpbxPrftSndTdSf/honesty-beyond-internal-truth), and [make an extraordinary effort](https://www.lesswrong.com/posts/GuEsfTpSDSbXFiseH/make-an-extraordinary-effort) when you've got [Something to Protect](https://www.lesswrong.com/posts/SGR4GxFK7KmW7ckCB/something-to-protect).
+[TODO: regrets and wasted time]