memoir: novices like me and Michelle Alleva ...

diff --git a/content/drafts/a-hill-of-validity-in-defense-of-meaning.md b/content/drafts/a-hill-of-validity-in-defense-of-meaning.md

index 0a93747..e63baaa 100644 (file)
--- a/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
+++ b/content/drafts/a-hill-of-validity-in-defense-of-meaning.md
@@ -104,11 +104,11 @@ I think I _am_ standing in defense of truth if have an _argument_ for _why_ my p
  
  One could argue that this is unfairly interpreting Yudkowsky's Tweets as having a broader scope than was intended—that Yudkowsky _only_ meant to slap down the specific false claim that using "he" for someone with a Y chromosome is "lying", without intending any broader implications about trans issues or the philosophy of language. It wouldn't be realistic or fair to expect every public figure to host a truly exhaustive debate on all related issues every time a fallacy they encounter in the wild annoys them enough for them to Tweet about that specific fallacy.
  
-However, I don't think this "narrow" reading is the most natural one. Yudkowsky had previously written of what he called [the fourth virtue of evenness](http://yudkowsky.net/rational/virtues/): "If you are selective about which arguments you inspect for flaws, or how hard you inspect for flaws, then every flaw you learn how to detect makes you that much stupider." He had likewise written [of reversed stupidity](https://www.lesswrong.com/posts/qNZM3EGoE5ZeMdCRt/reversed-stupidity-is-not-intelligence) (bolding mine):
+However, I don't think this "narrow" reading is the most natural one. Yudkowsky had previously written of what he called [the fourth virtue of evenness](http://yudkowsky.net/rational/virtues/): "If you are selective about which arguments you inspect for flaws, or how hard you inspect for flaws, then every flaw you learn how to detect makes you that much stupider." He had likewise written [on reversed stupidity](https://www.lesswrong.com/posts/qNZM3EGoE5ZeMdCRt/reversed-stupidity-is-not-intelligence) (bolding mine):
  
  > **To argue against an idea honestly, you should argue against the best arguments of the strongest advocates**. Arguing against weaker advocates proves _nothing_, because even the strongest idea will attract weak advocates.
  
-Relatedly, Scott Alexander had written about how ["weak men are superweapons"](https://slatestarcodex.com/2014/05/12/weak-men-are-superweapons/): speakers often selectively draw attention to the worst arguments in favor of a position, in an attempt to socially discredit people who have better arguments for the position (which the speaker ignores). In the same way, by _just_ slapping down a weak man from the "anti-trans" political coalition without saying anything else in a similarly prominent location, Yudkowsky was liable to mislead his readers (who trusted him to argue against ideas honestly) into thinking that there were no better arguments from the "anti-trans" side.
+Relatedly, Scott Alexander had written about how ["weak men are superweapons"](https://slatestarcodex.com/2014/05/12/weak-men-are-superweapons/): speakers often selectively draw attention to the worst arguments in favor of a position, in an attempt to socially discredit people who have better arguments for the position (which the speaker ignores). In the same way, by _just_ slapping down a weak man from the "anti-trans" political coalition without saying anything else in a similarly prominent location, Yudkowsky was liable to mislead his faithful students (who trusted him to argue against ideas honestly) into thinking that there were no better arguments from the "anti-trans" side.
  
  To be sure, it imposes a cost on speakers to not be able to Tweet about one specific annoying fallacy and then move on with their lives without the need for [endless disclaimers](http://www.overcomingbias.com/2008/06/against-disclai.html) about related but stronger arguments that they're _not_ addressing. But the fact that [Yudkowsky disclaimed that](https://twitter.com/ESYudkowsky/status/1067185907843756032) he wasn't taking a stand for or against Twitter's anti-misgendering policy demonstrates that he _didn't_ have an aversion to spending a few extra words to prevent the most common misunderstandings.
  
@@ -126,13 +126,13 @@ Given the empirical reality of the different trait distributions, "Who are the b
  
  In light of these empirical observations, Yudkowsky's suggestion that an ignorant commitment to an "Aristotelian binary" is the main reason someone might care about the integrity of women's sports, is revealed as an absurd strawman. This just isn't something any scientifically-literate person would write if they had actually thought about the issue _at all_, as contrasted to having _first_ decided (consciously or not) to bolster one's reputation among progressives by dunking on transphobes on Twitter, and wielding one's philosophy knowledge in the service of that political goal. The relevant empirical facts are _not subtle_, even if most people don't have the fancy vocabulary to talk about them in terms of "multivariate trait distributions."
  
-I'm picking on the "sports segregated around an Aristotelian binary" remark because sports is a case where the relevant effect sizes are _so_ large as to make the point [hard for all but the most ardent gender-identity partisans to deny](/2017/Jun/questions-such-as-wtf-is-wrong-with-you-people/). (For example, what the [Cohen's _d_](https://en.wikipedia.org/wiki/Effect_size#Cohen's_d) ≈ 2.6 effect size difference in muscle mass means is that a woman as strong as the _average_ man is _at the 99.5th percentile_ for women.) But the point is very general: biological sex actually exists and is sometimes decision-relevant. People who want to be able to talk about sex and make policy decisions on the basis of sex are not making an ontology error, because the ontology in which sex "actually" "exists" continues to make very good predictions in our current tech regime.
+I'm picking on the "sports segregated around an Aristotelian binary" remark because sports is a case where the relevant effect sizes are _so_ large as to make the point [hard for all but the most ardent gender-identity partisans to deny](/2017/Jun/questions-such-as-wtf-is-wrong-with-you-people/). (For example, what the [Cohen's _d_](https://en.wikipedia.org/wiki/Effect_size#Cohen's_d) ≈ [2.6 effect size difference in muscle mass](/papers/janssen_et_al-skeletal_muscle_mass_and_distribution.pdf) means is that a woman as strong as the _average_ man is _at the 99.5th percentile_ for women.) But the point is very general: biological sex actually exists and is sometimes decision-relevant. People who want to be able to talk about sex and make policy decisions on the basis of sex are not making an ontology error, because the ontology in which sex "actually" "exists" continues to make very good predictions in our current tech regime. It would be an absurdly [isolated demand for rigor](http://slatestarcodex.com/2014/08/14/beware-isolated-demands-for-rigor/) to expect someone to pass a graduate exam about the cognitive function of categorization before they can talk about sex.
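The percentile figure can be checked with a few lines of arithmetic. This is only a sketch under an idealizing assumption that is mine, not the cited paper's: both sexes' distributions are taken to be normal with equal standard deviation, so that a Cohen's _d_ of 2.6 puts the male mean 2.6 standard deviations above the female mean. The function name `percentile_in_lower_group` is made up for illustration.

```python
from statistics import NormalDist


def percentile_in_lower_group(d: float) -> float:
    """Fraction of the lower-mean group that falls below the higher-mean
    group's average, assuming both groups are normal with equal variance
    (an idealization) and are separated by a Cohen's d of `d`."""
    # The higher group's mean sits d standard deviations above the lower
    # group's mean, so the answer is the standard normal CDF evaluated at d.
    return NormalDist().cdf(d)


# Effect sizes quoted in the essay: muscle mass, Agreeableness,
# Neuroticism, short-term memory.
for d in (2.6, 0.5, 0.4, 0.2):
    print(f"d = {d}: {percentile_in_lower_group(d):.1%}")
```

Under those assumptions, _d_ = 2.6 comes out at roughly the 99.5th percentile, matching the figure above, while the smaller effect sizes leave the two distributions heavily overlapping.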
  
  Yudkowsky's claim to merely have been standing up for the distinction between facts and policy questions doesn't seem credible. It is, of course, true that pronoun and bathroom conventions are policy decisions rather than a matter of fact, but it's _bizarre_ to condescendingly point this out _as if it were the crux of contemporary trans-rights debates_. Conservatives and gender-critical feminists _know_ that trans-rights advocates aren't falsely claiming that trans women have XX chromosomes. If you _just_ wanted to point out that the organization of sports leagues is a policy question rather than a fact (as if anyone had doubted this), why would you throw in the "Aristotelian binary" strawman and belittle the matter as "humorous"? There are a lot of issues that I don't _personally_ care much about, but I don't see anything funny about the fact that other people _do_ care.
  
  If any concrete negative consequence of gender self-identity categories is going to be waved away with, "Oh, but that's a mere _policy_ decision that can be dealt with on some basis other than gender, and therefore doesn't count as an objection to the new definition of gender words", then it's not clear what the new definition is _for_.
  
-An illustration: like many gender-dysphoric males, I [cosplay](/2016/Dec/joined/) [female](/2017/Oct/a-leaf-in-the-crosswind/) [characters](/2019/Aug/a-love-that-is-out-of-anyones-control/) at fandom conventions sometimes. And, unfortunately, like many gender-dysphoric males, I'm _not very good at it_. I think someone looking at some of my cosplay photos and trying to describe their content in clear language—not trying to be nice to anyone or make a point, but just trying to use language as a map that reflects the territory—would say something like, "This is a photo of a man and he's wearing a dress." The word _man_ in that sentence is expressing _cognitive work_: it's a summary of the [lawful cause-and-effect evidential entanglement](https://www.lesswrong.com/posts/6s3xABaXKPdFwA3FS/what-is-evidence) whereby the photons reflecting off the photograph are correlated with photons reflecting off my body at the time the photo was taken, which are correlated with my externally-observable secondary sex characteristics (facial structure, beard shadow, _&c._), from which evidence an agent using an [efficient naïve-Bayes-like model](https://www.lesswrong.com/posts/gDWvLicHhcMfGmwaK/conditional-independence-and-naive-bayes) can assign me to its "man" category and thereby make probabilistic predictions about some of my traits that aren't directly observable from the photo, and achieve a better [score on those predictions](http://yudkowsky.net/rational/technical/) than if the agent had assigned me to its "adult human female" category, where by "traits" I mean not (just) particularly sex chromosomes ([as Yudkowsky suggested on Twitter](https://twitter.com/ESYudkowsky/status/1067291243728650243)), but the _conjunction_ of dozens or hundreds of measurements that are [_causally downstream_ of sex chromosomes](/2021/Sep/link-blood-is-thicker-than-water/): reproductive organs _and_ muscle mass (sex difference effect size of [Cohen's _d_](https://en.wikipedia.org/wiki/Effect_size#Cohen's_d) ≈ 2.6) _and_ Big Five Agreeableness (_d_ ≈ 0.5) _and_ Big Five Neuroticism (_d_ ≈ 0.4) _and_ short-term memory (_d_ ≈ 0.2, favoring women) _and_ white-to-gray-matter ratios in the brain _and_ probable socialization history _and_ [any number of other things](https://en.wikipedia.org/wiki/Sex_differences_in_human_physiology)—including differences we might not necessarily currently know about, but have prior reasons to suspect exist: no one _knew_ about sex chromosomes before 1905, but given all the other systematic differences between women and men, it would have been a reasonable guess (that turned out to be correct!) to suspect the existence of some sort of molecular mechanism of sex determination.
+An illustration: like many gender-dysphoric males, I [cosplay](/2016/Dec/joined/) [female](/2017/Oct/a-leaf-in-the-crosswind/) [characters](/2019/Aug/a-love-that-is-out-of-anyones-control/) at fandom conventions sometimes. And, unfortunately, like many gender-dysphoric males, I'm _not very good at it_. I think someone looking at some of my cosplay photos and trying to describe their content in clear language—not trying to be nice to anyone or make a point, but just trying to use language as a map that reflects the territory—would say something like, "This is a photo of a man and he's wearing a dress." The word _man_ in that sentence is expressing _cognitive work_: it's a summary of the [lawful cause-and-effect evidential entanglement](https://www.lesswrong.com/posts/6s3xABaXKPdFwA3FS/what-is-evidence) whereby the photons reflecting off the photograph are correlated with photons reflecting off my body at the time the photo was taken, which are correlated with my externally-observable secondary sex characteristics (facial structure, beard shadow, _&c._), from which evidence an agent using an [efficient naïve-Bayes-like model](https://www.lesswrong.com/posts/gDWvLicHhcMfGmwaK/conditional-independence-and-naive-bayes) can assign me to its "man" category and thereby make probabilistic predictions about some of my traits that aren't directly observable from the photo, and achieve a better [score on those predictions](http://yudkowsky.net/rational/technical/) than if the agent had assigned me to its "adult human female" category, where by "traits" I mean not (just) particularly sex chromosomes ([as Yudkowsky suggested on Twitter](https://twitter.com/ESYudkowsky/status/1067291243728650243)), but the _conjunction_ of dozens or hundreds of measurements that are [_causally downstream_ of sex chromosomes](/2021/Sep/link-blood-is-thicker-than-water/): reproductive organs _and_ muscle mass (sex difference effect size of [Cohen's _d_](https://en.wikipedia.org/wiki/Effect_size#Cohen's_d) ≈ 2.6) _and_ Big Five Agreeableness (_d_ ≈ 0.5) _and_ Big Five Neuroticism (_d_ ≈ 0.4) _and_ short-term memory (_d_ ≈ 0.2, favoring women) _and_ white-to-gray-matter ratios in the brain _and_ probable socialization history _and_ [any number of other things](/papers/archer-the_reality_and_evolutionary_significance_of_human_psychological_sex_differences.pdf)—including differences we might not necessarily currently know about, but have prior reasons to suspect exist: no one _knew_ about sex chromosomes before 1905, but given all the other systematic differences between women and men, it would have been a reasonable guess (that turned out to be correct!) to suspect the existence of some sort of molecular mechanism of sex determination.
  
  Forcing a speaker to say "trans woman" instead of "man" in that sentence depending on my verbally self-reported self-identity may not be forcing them to _lie_, exactly. (Because it's understood, "openly and explicitly and with public focus on the language and its meaning", what _trans women_ are; no one is making a false-to-fact claim about them having ovaries, for example.) But it _is_ forcing the speaker to obfuscate the probabilistic inference they were trying to communicate with the original sentence (about modeling the person in the photograph as being sampled from the "man" [cluster in configuration space](https://www.lesswrong.com/posts/WBw8dDkAWohFjWQSk/the-cluster-structure-of-thingspace)), and instead use language that suggests a different cluster-structure ("trans women", two words, are presumably a subcluster within the "women" cluster). Crowing in the public square about how people who object to being forced to "lie" must be ontologically confused is _ignoring the interesting part of the problem_. Gender identity's [claim to be non-disprovable](https://www.lesswrong.com/posts/fAuWLS7RKWD2npBFR/religion-s-claim-to-be-non-disprovable) mostly functions as a way to [avoid the belief's real weak points](https://www.lesswrong.com/posts/dHQkDNMhj692ayx78/avoiding-your-belief-s-real-weak-points).
  
@@ -260,9 +260,9 @@ Kelsey Piper replied, "[T]he people getting surgery to have bodies that do 'wome
  
  Another woman said, "'the original thing that already exists without having to try' sounds fake to me" (to the acclaim of 4 "+1" emoji reactions).
  
-The problem with this kind of exchange is not that anyone is being shouted down, nor that anyone is lying. The _problem_ is that people are motivatedly, ["algorithmically"](https://www.lesswrong.com/posts/sXHQ9R5tahiaXEZhR/algorithmic-intent-a-hansonian-generalized-anti-zombie) "playing dumb." I wish we had better terminology for this phenomenon, which is ubiquitous in human life. By "playing dumb", I don't mean to suggest that Kelsey was _consciously_ thinking, "I'm playing dumb in order to gain an advantage in this argument". I don't doubt that, _subjectively_, mentioning that cis women also get cosmetic surgery sometimes _felt like_ a relevant reply (because I had mentioned transition technology). It's just that, in context, I was very obviously trying to talk about the natural category of "biological sex", and Kelsey could have figured that out _if she had wanted to_.
+The problem with this kind of exchange is not that anyone is being shouted down, nor that anyone is lying. The _problem_ is that people are motivatedly, ["algorithmically"](https://www.lesswrong.com/posts/sXHQ9R5tahiaXEZhR/algorithmic-intent-a-hansonian-generalized-anti-zombie) "playing dumb." I wish we had more standard terminology for this phenomenon, which is ubiquitous in human life. By "playing dumb", I don't mean to suggest that Kelsey was _consciously_ thinking, "I'm playing dumb in order to gain an advantage in this argument". I don't doubt that, _subjectively_, mentioning that cis women also get cosmetic surgery sometimes _felt like_ a relevant reply (because I had mentioned transition technology). It's just that, in context, I was very obviously trying to talk about the natural category of "biological sex", and Kelsey could have figured that out _if she had wanted to_.
  
-It's not that anyone explicitly said, "Biological sex isn't real" in those words. ([The elephant in the brain](https://en.wikipedia.org/wiki/The_Elephant_in_the_Brain) knows it wouldn't be able to get away with _that_.) But if everyone correlatedly plays dumb whenever someone tries to _talk_ about sex in clear language in a context where that could conceivably hurt some trans person's feelings, I think what you have is a culture of _de facto_ biological sex denialism. ("'The original thing that already exists without having to try' sounds fake to me"!!)
+It's not that anyone explicitly said, "Biological sex isn't real" in those words. ([The elephant in the brain](https://en.wikipedia.org/wiki/The_Elephant_in_the_Brain) knows it wouldn't be able to get away with _that_.) But if everyone correlatedly plays dumb whenever someone tries to _talk_ about sex in clear language in a context where that could conceivably hurt some trans person's feelings, I think what you have is a culture of _de facto_ biological sex denialism. ("'The original thing that already exists without having to try' sounds fake to me"!!) It's not hard to get people to admit that trans women are different from cis women, but somehow they can't (in public, using words) follow the implication that trans women are different from cis women _because_ trans women are male.
  
  Ben thought I was wrong to think of this kind of behavior as non-ostracizing. The deluge of motivated nitpicking _is_ an implied marginalization threat, he explained: the game people are playing when they do that is to force me to choose between doing arbitrarily large amounts of interpretive labor, or being cast as never having answered these construed-as-reasonable objections, and therefore over time losing standing to make the claim, being thought of as unreasonable, not getting invited to events, _&c._
  
@@ -322,7 +322,7 @@ Without disclosing any specific content from private conversations with Yudkowsk
  
  Michael said that it seemed important that, if we thought Yudkowsky wasn't interested, we should have common knowledge among ourselves that we consider him to be choosing to be a cult leader.
  
-Meanwhile, my email thread with Scott got started back up again, although I wasn't expecting anything to come out of it. I expressed some regret that all the times I had emailed him over the past couple years had been when I was upset about something (like psych hospitals, or—something else) and wanted something from him, which was bad, because it was treating him as a means rather than an end—and then, despite that regret, continued prosecuting the argument.
+Meanwhile, my email thread with Scott got started back up again, although I wasn't expecting anything public to come out of it. I expressed some regret that all the times I had emailed him over the past couple years had been when I was upset about something (like psych hospitals, or—something else) and wanted something from him, which was bad, because it was treating him as a means rather than an end—and then, despite that regret, continued prosecuting the argument.
  
  One of Alexander's [most popular _Less Wrong_ posts ever had been about the noncentral fallacy, which Alexander called "the worst argument in the world"](https://www.lesswrong.com/posts/yCWPkLi8wJvewPbEp/the-noncentral-fallacy-the-worst-argument-in-the-world): for example, those who crow that abortion is _murder_ (because murder is the killing of a human being), or that Martin Luther King, Jr. was a _criminal_ (because he defied the segregation laws of the South), are engaging in a dishonest rhetorical maneuver in which they're trying to trick their audience into attributing attributes of the typical "murder" or "criminal" onto what are very noncentral members of those categories.
  
@@ -330,7 +330,7 @@ _Even if_ you're opposed to abortion, or have negative views about the historica
  
  In the form of a series of short parables, I tried to point out that Alexander's own "The Worst Argument in the World" is really complaining about the _same_ category-gerrymandering move that his "... Not Man for the Categories" comes out in favor of. We would not let someone get away with declaring, "I ought to accept an unexpected abortion or two deep inside the conceptual boundaries of what would normally not be considered murder if it'll save someone's life." Maybe abortion _is_ wrong and relevantly similar to the central sense of "murder", but you need to make that case _on the empirical merits_, not by linguistic fiat (Subject: "twelve short stories about language").
  
-... Scott still didn't get it. He said that he didn't see why he shouldn't accept one unit of categorizational awkwardness in exchange for sufficiently large utilitarian benefits. He made an analogy to some [Glowfic](https://www.glowfic.com/) lore, a story about orcs who had unwisely sworn an oath to serve the evil god Melkor. Though the orcs intend no harm of their own will, they're magically bound to obey Melkor's commands and serve as his terrible army or else suffer unbearable pain. Our heroine comes up with a solution: she founds a new religion featuring a deist God who also happens to be named Melkor. She convinces the orcs that since the oath didn't specify _which_ Melkor, they're free to follow her new God instead of evil-Melkor, and the magic making the oath binding apparently accepts this casuistry if the orc themelf does.
+... Scott still didn't get it. He said that he didn't see why he shouldn't accept one unit of categorizational awkwardness in exchange for sufficiently large utilitarian benefits. He made an analogy to some [Glowfic](https://www.glowfic.com/) lore, a story about orcs who had unwisely sworn an oath to serve the evil god Melkor. Though the orcs intend no harm of their own will, they're magically bound to obey Melkor's commands and serve as his terrible army or else suffer unbearable pain. Our heroine comes up with a solution: she founds a new religion featuring a deist God who also happens to be named Melkor. She convinces the orcs that since the oath didn't specify _which_ Melkor, they're free to follow her new God instead of evil-Melkor, and the magic making the oath binding apparently accepts this casuistry if the orcs themselves do.
  
  Scott's attitude towards the new interpretation of the oath in the story was analogous to his thinking about transgenderedness: sure, the new definition may be a little awkward and unnatural in some sense, but it's not literally objectively false, and it made life better for so many orcs. If [rationalists should win](https://www.lesswrong.com/posts/6ddcsdA2c2XpNpE5x/newcomb-s-problem-and-regret-of-rationality), then the true rationalist in this situation is the one who thought up this clever hack to save an entire species.
  
@@ -404,11 +404,11 @@ As such, we _shouldn't_ think that there are probably multiple kinds of gender d
  
  Had Yudkowsky been thinking that maybe if he Tweeted something favorable to my agenda, then me and the rest of Michael's gang would be satisfied and leave him alone?
  
-But ... if there's some _other_ reason you suspect there might be multiple species of dysphoria, but you _tell_ people your suspicion is because dysphoria has more than one proton, you're still misinforming people for political reasons, which was the _general_ problem we were trying to alert Yudkowsky to. (Someone who trusted you as a source of wisdom about rationality might try to apply your _fake_ "everything more complicated than protons tends to come in varieties" rationality lesson in some other context, and get the wrong answer.) Inventing fake rationality lessons in response to political pressure is _not okay_, and it still wasn't okay in this case just because in this case the political pressure happened to be coming from _me_.
+But ... if there's some _other_ reason you suspect there might be multiple species of dysphoria, but you _tell_ people your suspicion is because dysphoria has more than one proton, you're still misinforming people for political reasons, which was the _general_ problem we were trying to alert Yudkowsky to. (Someone who trusted you as a source of wisdom about rationality might try to apply your _fake_ "everything more complicated than protons tends to come in varieties" rationality lesson in some other context, and get the wrong answer.) Inventing fake rationality lessons in response to political pressure is _not okay_, and the fact that in this case the political pressure happened to be coming from _me_, didn't make it okay.
  
  I asked the posse if this analysis was worth sending to Yudkowsky. Michael said it wasn't worth the digression. He asked if I was comfortable generalizing from Scott's behavior, and what others had said about fear of speaking openly, to assuming that something similar was going on with Eliezer? If so, then now that we had common knowledge, we needed to confront the actual crisis, which was that dread was tearing apart old friendships and causing fanatics to betray everything that they ever stood for while its existence was still being denied.
  
-Another thing that happened that week was that former MIRI researcher Jessica Taylor joined our posse (being at an in-person meeting with Ben and Sarah and another friend on the seventeenth, and getting tagged in subsequent emails). Significantly for political purposes, Jessica is trans. We didn't have to agree up front on all gender issues for her to see the epistemology problem with "... Not Man for the Categories" and to say that maintaining a narcissistic fantasy by controlling category boundaries wasn't what _she_ wanted, as a trans person. (On the seventeenth, when I lamented the state of a world that incentivized us to be political enemies, her response was, "Well, we could talk about it first.") Michael said that me and Jessica together had more moral authority than either of us alone.
+Another thing that happened that week was that former MIRI researcher Jessica Taylor joined our posse (being at an in-person meeting with Ben and Sarah and another friend on the seventeenth, and getting tagged in subsequent emails). Significantly for political purposes, Jessica is trans. We didn't have to agree up front on all gender issues for her to see the epistemology problem with "... Not Man for the Categories", and to say that maintaining a narcissistic fantasy by controlling category boundaries wasn't what _she_ wanted, as a trans person. (On the seventeenth, when I lamented the state of a world that incentivized us to be political enemies, her response was, "Well, we could talk about it first.") Michael said that me and Jessica together had more moral authority than either of us alone.
  
  As it happened, I ran into Scott on the train that Friday, the twenty-second. He said that he wasn't sure why the oft-repeated moral of "A Human's Guide to Words" had been "You can't define a word any way you want" rather than "You _can_ define a word any way you want, but then you have to deal with the consequences."
  
@@ -422,7 +422,7 @@ I [didn't want to bring it up at the time because](https://twitter.com/zackmdavi
  
  As for the parable about orcs, I thought it was significant that Scott chose to tell the story from the standpoint of non-orcs deciding what [verbal behaviors](https://www.lesswrong.com/posts/NMoLJuDJEms7Ku9XS/guessing-the-teacher-s-password) to perform while orcs are around, rather than the standpoint of the _orcs themselves_. For one thing, how do you _know_ that serving evil-Melkor is a life of constant torture? Is it at all possible, in the bowels of Christ, that someone has given you _misleading information_ about that? Moreover, you _can't_ just give an orc a clever misinterpretation of an oath and have them believe it. First you have to [cripple their _general_ ability](https://www.lesswrong.com/posts/XTWkjCJScy2GFAgDt/dark-side-epistemology) to correctly interpret oaths, for the same reason that you can't get someone to believe that 2+2=5 without crippling their _general_ ability to do arithmetic. We weren't talking about a little "white lie" that the listener will never get to see falsified (like telling someone their dead dog is in heaven); the orcs _already know_ the text of the oath, and you have to break their ability to _understand_ it. Are you willing to permanently damage an orc's ability to reason, in order to save them pain? For some sufficiently large amount of pain, surely. But this isn't a choice to make lightly—and the choices people make to satisfy their own consciences, don't always line up with the volition of their alleged beneficiaries. We think we can lie to save others from pain, without ourselves _wanting to be lied to_. But behind the veil of ignorance, it's the same choice!
  
-I _also_ had more to say about philosophy of categories: I thought I could be more rigorous about the difference between "caring about predicting different variables" and "caring about consequences", in a way that Eliezer would _have_ to understand even if Scott didn't. (Scott had claimed that he could use gerrymandered categories and still be just as good at making predictions—but that's not true if we're talking about the _internal_ use of categories as a [cognitive algorithm](https://www.lesswrong.com/posts/HcCpvYLoSFP4iAqSz/rationality-appreciating-cognitive-algorithms), rather than mere verbal behavior: it's always easy to _say_ "_X_ is a _Y_" for arbitrary _X_ and _Y_ if the stakes demand it.)
+I _also_ had more to say about philosophy of categories: I thought I could be more rigorous about the difference between "caring about predicting different variables" and "caring about consequences", in a way that Eliezer would _have_ to understand even if Scott didn't. (Scott had claimed that he could use gerrymandered categories and still be just as good at making predictions—but that's just not true if we're talking about the _internal_ use of categories as a [cognitive algorithm](https://www.lesswrong.com/posts/HcCpvYLoSFP4iAqSz/rationality-appreciating-cognitive-algorithms), rather than mere verbal behavior: it's always easy to _say_ "_X_ is a _Y_" for arbitrary _X_ and _Y_ if the stakes demand it, but if you're _actually_ using that concept of _Y_ internally, that does have effects on your world-model.)
  
  But after consultation with the posse, I concluded that further email prosecution was not useful at this time; the philosophy argument would work better as a public _Less Wrong_ post. So my revised Category War to-do list was:
  
@@ -461,7 +461,7 @@ In "... Boundaries?", I unify the two positions and explain how both Yudkowsky a
  
  But _given_ a subspace of interest, the _technical_ criterion of drawing category boundaries around [regions of high density in configuration space](https://www.lesswrong.com/posts/yLcuygFfMfrfK8KjF/mutual-information-and-density-in-thingspace) still applies. There is Law governing which uses of communication signals transmit which information, and the Law can't be brushed off with, "whatever, it's a pragmatic choice, just be nice." I demonstrate the Law with a couple of simple mathematical examples: if you redefine a codeword that originally pointed to one cluster, to also include another, that changes the quantitative predictions you make about an unobserved coordinate given the codeword; if an employer starts giving the title "Vice President" to line workers, that decreases the mutual information between the job title and properties of the job.
  
-(Jessica and Ben's [discussion of the job title example in relation to the _Wikipedia_ summary of Jean Baudrillard's _Simulacra and Simulation_ ended up getting published separately](http://benjaminrosshoffman.com/excerpts-from-a-larger-discussion-about-simulacra/), and ended up taking on a life of its own in [future posts](http://benjaminrosshoffman.com/simulacra-subjectivity/), [including](https://thezvi.wordpress.com/2020/06/15/simulacra-and-covid-19/) by [other authors](https://thezvi.wordpress.com/2020/08/03/unifying-the-simulacra-definitions/).)
+(Jessica and Ben's [discussion of the job title example in relation to the _Wikipedia_ summary of Jean Baudrillard's _Simulacra and Simulation_ ended up getting published separately](http://benjaminrosshoffman.com/excerpts-from-a-larger-discussion-about-simulacra/), and ended up taking on a life of its own [in](http://benjaminrosshoffman.com/blame-games/) [future](http://benjaminrosshoffman.com/blatant-lies-best-kind/) [posts](http://benjaminrosshoffman.com/simulacra-subjectivity/), [including](https://www.lesswrong.com/posts/Z5wF8mdonsM2AuGgt/negative-feedback-and-simulacra) [a](https://www.lesswrong.com/posts/NiTW5uNtXTwBsFkd4/signalling-and-simulacra-level-3) [number](https://www.lesswrong.com/posts/tF8z9HBoBn783Cirz/simulacrum-3-as-stag-hunt-strategy) [of](https://www.lesswrong.com/tag/simulacrum-levels) [posts](https://thezvi.wordpress.com/2020/05/03/on-negative-feedback-and-simulacra/) [by](https://thezvi.wordpress.com/2020/06/15/simulacra-and-covid-19/) [other](https://thezvi.wordpress.com/2020/08/03/unifying-the-simulacra-definitions/) [authors](https://thezvi.wordpress.com/2020/09/07/the-four-children-of-the-seder-as-the-simulacra-levels/).)
  
  Sarah asked if the math wasn't a bit overkill: were the calculations really necessary to make the basic point that good definitions should be about classifying the world, rather than about what's pleasant or politically expedient to say? I thought the math was _really important_ as an appeal to principle—and [as intimidation](https://slatestarcodex.com/2014/08/10/getting-eulered/). (As it is written, [_the tenth virtue is precision!_](http://yudkowsky.net/rational/virtues/) Even if you cannot do the math, knowing that the math exists tells you that the dance step is precise and has no room in it for your whims.)
  
@@ -469,9 +469,9 @@ Sarah asked if the math wasn't a bit overkill: were the calculations really nece
  
  My thinking here was that the posse's previous email campaigns had been doomed to failure by being too closely linked to the politically-contentious object-level topic which reputable people had strong incentives not to touch with a ten-foot pole. So if I wrote this post _just_ explaining what was wrong with the claims Yudkowsky and Alexander had made about the philosophy of language, with perfectly innocent examples about dolphins and job titles, that would remove the political barrier and [leave a line of retreat](https://www.lesswrong.com/posts/3XgYbghWruBMrPTAL/leave-a-line-of-retreat) for Yudkowsky to correct the philosophy of language error. And then if someone with a threatening social-justicey aura were to say, "Wait, doesn't this contradict what you said about trans people earlier?", stonewall them. (Stonewall _them_ and not _me_!)
  
-I could see a case that it was unfair of me to include subtext and then expect people to engage with the text, but if we weren't going to get into full-on gender-politics on _Less Wrong_ (which seemed like a bad idea), but gender politics _was_ motivating an epistemology error, I wasn't sure what else I'm supposed to do! I was pretty constrained here!
+I could see a case that it was unfair of me to include subtext and then expect people to engage with the text, but if we weren't going to get into full-on gender-politics on _Less Wrong_ (which seemed like a bad idea), but gender politics _was_ motivating an epistemology error, I wasn't sure what else I was supposed to do! I was pretty constrained here!
  
-(I did regret having accidentally "poisoned the well" the previous month by impulsively sharing the previous year's ["Blegg Mode"](/2018/Feb/blegg-mode/) [as a _Less Wrong_ linkpost](https://www.lesswrong.com/posts/GEJzPwY8JedcNX2qz/blegg-mode). "Blegg Mode" had originally been drafted as part of "... To Make Predictions" before getting spun off as a separate post. Frustrated in March at our failing email campaign, I thought it was politically "clean" enough to belatedly share, but it proved to be insufficiently deniably allegorical. It's plausible that some portion of the _Less Wrong_ audience would have been more receptive to "... Boundaries?" as not-politically-threatening philosophy, if they hadn't been alerted to the political context by the trainwreck in the comments on the "Blegg Mode" linkpost.)
+(I did regret having accidentally "poisoned the well" the previous month by impulsively sharing the previous year's ["Blegg Mode"](/2018/Feb/blegg-mode/) [as a _Less Wrong_ linkpost](https://www.lesswrong.com/posts/GEJzPwY8JedcNX2qz/blegg-mode). "Blegg Mode" had originally been drafted as part of "... To Make Predictions" before getting spun off as a separate post. Frustrated in March at our failing email campaign, I thought it was politically "clean" enough to belatedly share, but it proved to be insufficiently [deniably allegorical](/tag/deniably-allegorical/). It's plausible that some portion of the _Less Wrong_ audience would have been more receptive to "... Boundaries?" as not-politically-threatening philosophy, if they hadn't been alerted to the political context by the 60+-comment trainwreck on the "Blegg Mode" linkpost.)
  
  -----
  
@@ -527,12 +527,31 @@ mutualist pattern where Michael by himself isn't very useful for scholarship (he
  15 Sep Glen Weyl apology
  ]
  
-[TODO: Ziz incident; more upset about gender validation than the felony charges, which were equally ridiculous and more obviously linked to physical violence
-complicity with injustice "Ziz isn't going to be a problem for you anymore"]
  
-[TODO: a culture that has gone off the rails; my warning points to Vaniver]
  
-[TODO: write to Ben about being stuck on memoir]
+In November, I received an interesting reply on my philosophy-of-categorization thesis from MIRI researcher Abram Demski. Abram asked: ideally, shouldn't all conceptual boundaries be drawn with appeal-to-consequences? Wasn't the problem just with bad (motivated, shortsighted) appeals to consequences? Agents categorize in order to make decisions. The best classifier for an application depends on the costs and benefits. As a classic example, it's very important for evolved prey animals to avoid predators, so it makes sense for their predator-detection classifiers to be configured such that they jump away from every rustling in the bushes, even if it's usually not a predator.
+
+I had thought of the "false-positives are better than false-negatives when detecting predators" example as being about the limitations of evolution as an AI designer: messy evolved animal brains don't bother to track probability and utility separately the way a cleanly-designed AI could. As I had explained in "... Boundaries?", it made sense for _what_ variables you paid attention to, to be motivated by consequences. But _given_ the subspace that's relevant to your interests, you want to run an epistemically legitimate clustering algorithm on the data you see there, which depends on the data, not your values. The only reason value-dependent gerrymandered category boundaries seem like a good idea if you're not careful about philosophy is because it's _wireheading_. Ideal probabilistic beliefs shouldn't depend on consequences.
+
+Abram didn't think the issue was so clear-cut. Where do "probabilities" come from, in the first place? The reason we expect something like Bayesianism to be an attractor among self-improving agents is _because_ probabilistic reasoning is broadly useful: epistemology can be _derived_ from instrumental concerns. He agreed that severe wireheading issues _potentially_ arise if you allow consequentialist concerns to affect your epistemics—
+
+But the alternative view had its own problems. If your AI consists of a consequentialist module that optimizes for utility in the world, and an epistemic module that optimizes for the accuracy of its beliefs, that's _two_ agents, not one: how could that be reflectively coherent? You could, perhaps, bite the bullet here, for fear that consequentialism doesn't tile and that wireheading was inevitable. On this view, Abram explained, "Agency is an illusion which can only be maintained by crippling agents and giving them a split-brain architecture where an instrumental task-monkey does all the important stuff while an epistemic overseer supervises." Whether this view was ultimately tenable or not, this did show that trying to forbid appeals-to-consequences entirely led to strange places. I didn't immediately have an answer for Abram, but I was grateful for the engagement. (Abram was clearly addressing the real philosophical issues, and not just trying to mess with me the way almost everyone else in Berkeley, up to and including Eliezer Yudkowsky, was trying to mess with me.)
+
+Also in November, I wrote to Ben about how I was still stuck on writing the grief-memoir. My _plan_ had been that it should have been possible to tell the story of the Category War while glomarizing about the content of private conversations, then offer Scott and Eliezer pre-publication right of reply (because it's only fair to give your former-hero-current-[frenemies](https://en.wikipedia.org/wiki/Frenemy) warning when you're about to publicly characterize them as having been intellectually dishonest), then share it to _Less Wrong_ and the /r/TheMotte culture war thread, and then I would have the emotional closure to move on with my life (learn math, go to gym, chop wood, carry water) and not be a mentally-dominated cultist.
+
+The reason it _should_ be safe to write is because Explaining Things is Good. It should be possible to say, "This is not a social attack; I'm not saying 'rationalists Bad, Yudkowsky Bad'; I'm just trying to carefully _tell the true story_ about why, as a matter of cause-and-effect, I've been upset this year, including addressing counterarguments for why some would argue that I shouldn't be upset, why other people could be said to be behaving 'reasonably' given their incentives, why I nevertheless wish they'd be braver and adhere to principle rather than 'reasonably' following incentives, _&c_."
+
+So why couldn't I write? Was it that I didn't know how to make "This is not a social attack" credible? Maybe because it wasn't true?? I was afraid that telling a story about our leader being intellectually dishonest was "the nuclear option" in a way that I couldn't credibly cancel with "But I'm just telling a true story about a thing that was important to me that actually happened" disclaimers. If you're slowly-but-surely gaining territory in a conventional war, _suddenly_ escalating to nukes seems pointlessly destructive. This metaphor is horribly non-normative ([arguing is not a punishment!](https://srconstantin.github.io/2018/12/15/argue-politics-with-your-best-friends.html) carefully telling a true story _about_ an argument is not a nuke!), but I didn't know how to make it stably go away.
+
+A more motivationally-stable compromise would be to try to split off whatever _generalizable insights_ would have been part of the story into their own posts that don't make it personal. ["Heads I Win, Tails?—Never Heard of Her"](https://www.lesswrong.com/posts/DoPo4PDjgSySquHX8/heads-i-win-tails-never-heard-of-her-or-selective-reporting) had been a huge success as far as I was concerned, and I could do more of that kind of thing, analyzing the social stuff I was worried about, without making it personal, even if, secretly, it actually was personal.
+
+Ben replied that it didn't seem like it was clear to me that I was a victim of systemic abuse, and that I was trying to figure out whether I was being fair to my abuser. He thought if I could internalize that, I would be able to forgive myself a lot of messiness, which would reduce the perceived complexity of the problem.
+
+I said I would bite that bullet: yes! Yes, I was trying to figure out whether I was being fair to my abusers, and it was an important question to get right! "Other people's lack of standards harmed me, therefore I don't need to hold myself to standards in my response because I have [extenuating circumstances](https://www.lesswrong.com/posts/XYrcTJFJoYKX2DxNL/extenuating-circumstances)" would be a _lame excuse_.
+
+(This seemed correlated with the recurring stalemated disagreement within our coordination group, where Michael/Ben/Jessica would say, "Fraud, if that word _ever_ meant anything", and while I agreed that they were pointing to an important way in which things were messed up, I was still sympathetic to the Caliphate defender's reply that the Vassarite usage of "fraud" was motte-and-baileying between vastly different senses of _fraud_; I wanted to do _more work_ to formulate a _more precise theory_ of the psychology of deception to describe exactly how things are messed up in a way that wouldn't be susceptible to the motte-and-bailey charge.)
+
+[TODO: a culture that has gone off the rails; my warning points to Vaniver]
  
  [TODO: plan to reach out to Rick]
  
@@ -555,9 +574,27 @@ There's another very important part of the story that would fit around here chro
  
  [TODO: theorizing about on the margin]
  
+[TODO: "Autogenderphilia Is Common"]
+
+[TODO: help from Jessica for "Unnatural Categories"]
+
  [TODO: "out of patience" email]
  [TODO: Sep 2020 categories clarification from EY—victory?!]
  
+[TODO: briefly mention breakup with Vassar group]
+
+[TODO: "Unnatural Categories Are Optimized for Deception"
+
+Abram was right
+
+the fact that it didn't means that not tracking it can be an effective AI design! Just because evolution takes shortcuts that human engineers wouldn't doesn't mean shortcuts are "wrong" (instead, there are laws governing which kinds of shortcuts work).
+
+Embedded agency means that the AI shouldn't have to fundamentally reason differently about "rewriting code in some 'external' program" and "rewriting 'my own' code." In that light, it makes sense to regard "have accurate beliefs" as merely a convergent instrumental subgoal, rather than what rationality is about
+
+somehow accuracy seems more fundamental than power or resources ... could that be formalized?
+]
+
+
  [TODO: That should have been the end of the story, but then—he revisited the pronouns issue!!!]
  
  [TODO: based on the timing, the Feb. 2021 pronouns post was likely causally downstream of me being temporarily more salient to EY because of my highly-Liked response to his "anyone at this point that anybody who openly hates on this community generally or me personally is probably also a bad person inside" from 17 February; it wasn't gratuitously out of the blue]
@@ -607,7 +644,7 @@ Again, as discussed in "Challenges to Yudkowsky's Pronoun Reform Proposal", a co
  
  It's quite another thing altogether to _simultaneously_ try to prevent a speaker from using _tú_ to indicate disrespect towards a social superior (on the stated rationale that the _tú_/_usted_ distinction is dumb and shouldn't exist), while _also_ refusing to entertain or address the speaker's arguments explaining _why_ they think their interlocutor is unworthy of the deference that would be implied by _usted_ (because such arguments are "unspeakable" for political reasons). That's just psychologically abusive.
  
-If Yudkowsky _actually_ possessed (and felt motivated to use) the "ability to independently invent everything important that would be on the other side of the filter and check it [himself] before speaking", it would be _obvious_ to him that "Gendered Pronouns For Everyone and Asking To Leave The System Is Lying" isn't the hill anyone would care about dying on if it weren't a Schelling point. A lot of TERF-adjacent folk would be _overjoyed_ to concede the (boring, insubstantial) matter of pronouns as a trivial courtesy if it meant getting to _actually_ address their real concerns of "Biological Sex Actually Exists", and ["Biological Sex Cannot Be Changed With Existing or Foreseeable Technology"](https://www.lesswrong.com/posts/QZs4vkC7cbyjL9XA9/changing-emotions) and "Biological Sex Is Sometimes More Relevant Than Self-Declared Gender Identity." The reason so many of them are inclined to stand their ground and not even offer the trivial courtesy is because they suspect, correctly, that the matter of pronouns is being used as a rhetorical wedge to try to prevent people from talking or thinking about sex.
+If Yudkowsky _actually_ possessed (and felt motivated to use) the "ability to independently invent everything important that would be on the other side of the filter and check it [himself] before speaking", it would be _obvious_ to him that "Gendered Pronouns For Everyone and Asking To Leave The System Is Lying" isn't the hill anyone would care about dying on if it weren't a Schelling point. A lot of TERF-adjacent folk would be _overjoyed_ to concede the (boring, insubstantial) matter of pronouns as a trivial courtesy if it meant getting to _actually_ address their real concerns of "Biological Sex Actually Exists", and ["Biological Sex Cannot Be Changed With Existing or Foreseeable Technology"](https://www.lesswrong.com/posts/QZs4vkC7cbyjL9XA9/changing-emotions) and "Biological Sex Is Sometimes More Relevant Than Subjective Gender Identity." The reason so many of them are inclined to stand their ground and not even offer the trivial courtesy is because they suspect, correctly, that the matter of pronouns is being used as a rhetorical wedge to try to prevent people from talking or thinking about sex.
  
  Having analyzed the _ways_ in which Yudkowsky is playing dumb here, what's still not entirely clear is _why_. Presumably he cares about maintaining his credibility as an insightful and fair-minded thinker. Why tarnish that by putting on this haughty performance?
  
@@ -666,15 +703,15 @@ If the idea of being fired from the Snodgrass campaign or being unpopular with p
  
  I see the phrase "bad faith" thrown around more than I think people know what it means. "Bad faith" doesn't mean "with ill intent", and it's more specific than "dishonest": it's [adopting the surface appearance of being moved by one set of motivations, while actually acting from another](https://en.wikipedia.org/wiki/Bad_faith).
  
-For example, an [insurance company employee](https://en.wikipedia.org/wiki/Claims_adjuster) who goes through the motions of investigating your claim while privately intending to deny it might never consciously tell an explicit "lie", but is definitely acting in bad faith: they're asking you questions, demanding evidence, _&c._ in order to _make it look like_ you'll get paid if you prove the loss occurred—whereas in reality, you're just not going to be paid. Your responses to the claim inspector aren't completely casually _inert_: if you can make an extremely strong case that the loss occurred as you say, then the claim inspector might need to put some effort into coming up with some ingenious excuse to deny your claim in ways that exhibit general claim-inspection principles. But at the end of the day, the inspector is going to say what they need to say in order to protect the company's loss ratio, as is personally prudent.
+For example, an [insurance company employee](https://en.wikipedia.org/wiki/Claims_adjuster) who goes through the motions of investigating your claim while privately intending to deny it might never consciously tell an explicit "lie", but is definitely acting in bad faith: they're asking you questions, demanding evidence, _&c._ in order to _make it look like_ you'll get paid if you prove the loss occurred—whereas in reality, you're just not going to be paid. Your responses to the claim inspector aren't completely causally _inert_: if you can make an extremely strong case that the loss occurred as you say, then the claim inspector might need to put some effort into coming up with some ingenious excuse to deny your claim, in ways that exhibit general claim-inspection principles. But at the end of the day, the inspector is going to say what they need to say in order to protect the company's loss ratio, as is sometimes personally prudent.
  
  With this understanding of bad faith, we can read Yudkowsky's "it is sometimes personally prudent [...]" comment as admitting that his behavior on politically-charged topics is in bad faith—where "bad faith" isn't a meaningless insult, but [literally refers](http://benjaminrosshoffman.com/can-crimes-be-discussed-literally/) to the pretending-to-have-one-set-of-motivations-while-acting-according-to-another behavior, such that accusations of bad faith can be true or false. Yudkowsky will take care not to consciously tell an explicit "lie", while going through the motions to _make it look like_ he's genuinely engaging with questions where I need the right answers in order to make extremely impactful social and medical decisions—whereas in reality, he's only going to address a selected subset of the relevant evidence and arguments that won't get him in trouble with progressives.
  
  To his credit, he _will_ admit that he's only willing to address a selected subset of arguments—but while doing so, he claims an absurd "confidence in [his] own ability to independently invent everything important that would be on the other side of the filter and check it [himself] before speaking" while _simultaneously_ blatantly mischaracterizing his opponents' beliefs! ("Gendered Pronouns For Everyone and Asking To Leave The System Is Lying" doesn't pass anyone's [ideological Turing test](https://www.econlib.org/archives/2011/06/the_ideological.html).)
  
-Counterarguments aren't completely causally _inert_: if you can make an extremely strong case that Biological Sex Is Sometimes More Relevant Than Self-Declared Gender Identity, Yudkowsky will put some effort into coming up with some ingenious excuse for why he _technically_ never said otherwise, in ways that exhibit generally rationalist principles. But at the end of the day, Yudkowsky is going to say what he needs to say in order to protect his reputation, as is personally prudent.
+Counterarguments aren't completely causally _inert_: if you can make an extremely strong case that Biological Sex Is Sometimes More Relevant Than Self-Declared Gender Identity, Yudkowsky will put some effort into coming up with some ingenious excuse for why he _technically_ never said otherwise, in ways that exhibit generally rationalist principles. But at the end of the day, Yudkowsky is going to say what he needs to say in order to protect his reputation, as is sometimes personally prudent.
  
-Even if one were to agree with this description of Yudkowsky's behavior, it doesn't immediately follow that Yudkowsky is making the wrong decision. Again, "bad faith" is meant as a literal description that makes predictions about behavior, not a contentless attack—maybe there are some circumstances in which engaging some amount of bad faith is the right thing to do, given the constraints one faces! For example, when talking to people on Twitter with a very different ideological background from me, I sometimes anticipate that if my interlocutor knew what I was actually thinking, they wouldn't want to talk to me, so I engage in a bit of what could be called ["concern trolling"](https://geekfeminism.fandom.com/wiki/Concern_troll): I take care to word my replies in a way that makes it look like I'm more ideologically aligned with them than I actually am. (For example, I [never say "assigned female/male at birth" in my own voice on my own platform](/2019/Sep/terminology-proposal-developmental-sex/), but I'll do it in an effort to speak my interlocutor's language.) I think of this as the _minimal_ amount of strategic bad faith needed to keep the conversation going, to get my interlocutor to evaluate my argument on its own merits, rather than rejecting it for coming from an ideological enemy. In cases such as these, I'm willing to defend my behavior as acceptable—there _is_ a sense in which I'm being deceptive by optimizing my language choice to make my interlocutor make bad guesses about my ideological alignment, but I'm comfortable with that amount and scope of deception in the service of correcting the distortion where I don't think my interlocutor _should_ be paying attention to my personal alignment.
+Even if one were to agree with this description of Yudkowsky's behavior, it doesn't immediately follow that Yudkowsky is making the wrong decision. Again, "bad faith" is meant as a literal description that makes predictions about behavior, not a contentless attack—maybe there are some circumstances in which engaging in some amount of bad faith is the right thing to do, given the constraints one faces! For example, when talking to people on Twitter with a very different ideological background from me, I sometimes anticipate that if my interlocutor knew what I was actually thinking, they wouldn't want to talk to me, so I occasionally engage in a bit of what could be called ["concern trolling"](https://geekfeminism.fandom.com/wiki/Concern_troll): I take care to word my replies in a way that makes it look like I'm more ideologically aligned with them than I actually am. (For example, I [never say "assigned female/male at birth" in my own voice on my own platform](/2019/Sep/terminology-proposal-developmental-sex/), but I'll do it in an effort to speak my interlocutor's language.) I think of this as the _minimal_ amount of strategic bad faith needed to keep the conversation going, to get my interlocutor to evaluate my argument on its own merits, rather than rejecting it for coming from an ideological enemy. In cases such as these, I'm willing to defend my behavior as acceptable—there _is_ a sense in which I'm being deceptive by optimizing my language choice to make my interlocutor make bad guesses about my ideological alignment, but I'm comfortable with that amount and scope of deception in the service of correcting the distortion where I don't think my interlocutor _should_ be paying attention to my personal alignment.
  
  That is, my bad faith concern-trolling gambit of deceiving people about my ideological alignment in the hopes of improving the discussion seems like something that makes our collective beliefs about the topic-being-argued-about _more_ accurate. (And the topic-being-argued-about is presumably of greater collective interest than which "side" I personally happen to be on.)
  
@@ -694,23 +731,23 @@ But if you _actually cared_ about not deceiving your readers, you would want to
  
  "[P]eople do _know_ they're living in a half-Stalinist environment," Yudkowsky says. "I think people are better off at the end of that," he says. But who are "people", specifically? One of the problems with utilitarianism is that it doesn't interact well with game theory. If a policy makes most people better off, at the cost of throwing a few others under the bus, is it the right thing to do? Depending on the details, maybe! But you probably shouldn't expect the victims to meekly go under the wheels without a fight. That's why I'm telling you this 50,000-word sob story about how _I_ didn't know, and _I'm_ not better off.
  
-In [one of Yudkowsky's roleplaying fiction threads](https://www.glowfic.com/posts/4508), Thellim, a woman hailing from [a saner alternate version of Earth called dath ilan](https://www.lesswrong.com/tag/dath-ilan), [expresses horror and disgust at how shallow and superficial the characters in _Pride and Prejudice_ are, in contrast to what a human being _should_ be](https://www.glowfic.com/replies/1592898#reply-1592898):
+In [one of Yudkowsky's roleplaying fiction threads](https://www.glowfic.com/posts/4508), Thellim, a woman hailing from [a saner alternate version of Earth called dath ilan](https://www.lesswrong.com/tag/dath-ilan), [expresses horror and disgust at how shallow and superficial the characters in Jane Austen's _Pride and Prejudice_ are, in contrast to what a human being _should_ be](https://www.glowfic.com/replies/1592898#reply-1592898):
  
  > [...] the author has made zero attempt to even try to depict Earthlings as having reflection, self-observation, a fire of inner life; most characters in _Pride and Prejudice_ bear the same relationship to human minds as a stick figure bears to a photograph. People, among other things, have the property of trying to be people; the characters in Pride and Prejudice have no visible such aspiration. Real people have concepts of their own minds, and contemplate their prior ideas of themselves in relation to a continually observed flow of their actual thoughts, and try to improve both their self-models and their selves. It's impossible to imagine any of these people, even Elizabeth, as doing that thing Thellim did a few hours ago, where she noticed she was behaving like Verrez and snapped out of it. Just like any particular Verrez always learns to notice he is being Verrez and snap out of it, by the end of any of his alts' novels.
  
  When someone else doesn't see the problem with Jane Austen's characters, Thellim [redoubles her determination to explain the problem](https://www.glowfic.com/replies/1592987#reply-1592987): "_She is not giving up that easily. Not on an entire planet full of people._"
  
-Thellim's horror at the fictional world of Jane Austen is basically how I feel about trans culture. It _actively discourages self-modeling!_ People who have cross-sex fantasies are encouraged to reify them into a gender identity which everyone else is supposed to unquestioningly accept. Obvious critical questions about what's actually going on etiologically, what it means for an identity to be true, _&c._ are strongly discouraged as hateful, hurtful, distressing, _&c._
+Thellim's horror at the fictional world of Jane Austen is basically how I feel about "trans" culture in the current year. It _actively discourages self-modeling!_ People who have cross-sex fantasies are encouraged to reify them into a gender identity which everyone else is supposed to unquestioningly accept. Obvious critical questions about what's actually going on etiologically, what it means for an identity to be true, _&c._ are strongly discouraged as hateful, hurtful, distressing, _&c._
  
-The problem is _not_ that I think there's anything wrong with having cross-sex fantasies, and wanting the fantasy to be real—just as Thellim's problem with _Pride and Prejudice_ is not there being anything wrong with wanting to marry a suitable bachelor. These are perfectly respectable goals.
+The problem is _not_ that I think there's anything wrong with having cross-sex fantasies, and wanting the fantasy to become real—just as Thellim's problem with _Pride and Prejudice_ is not there being anything wrong with wanting to marry a suitable bachelor. These are perfectly respectable goals.
  
-The _problem_ is that people who are trying to be people, people who are trying to acheive their goals _in reality_, do so in a way involves having concepts of their own minds, and trying to improve both their self-models and their selves—and that's _not possible_ in a culture that tries to ban, as heresy, the idea that it's possible for someone's self-model to be wrong.
+The _problem_ is that people who are trying to be people, people who are trying to achieve their goals _in reality_, do so in a way that involves having concepts of their own minds, and trying to improve both their self-models and their selves—and that's _not possible_ in a culture that tries to ban, as heresy, the idea that it's possible for someone's self-model to be wrong.
  
-A trans woman I follow on Twitter complained that a receptionist at her workplace said she looked like a male celebrity. "I'm so mad," she fumed. "I look like this right now"—there was a photo attached to the Tweet—"how could anyone ever think that was an okay thing to say?"
+A trans woman I follow on Twitter complained that a receptionist at her workplace said she looked like some male celebrity. "I'm so mad," she fumed. "I look like this right now"—there was a photo attached to the Tweet—"how could anyone ever think that was an okay thing to say?"
  
  It _is_ genuinely sad that the author of those Tweets didn't get perceived the way she would prefer! But the thing I want her to understand, a thing I think any sane adult should understand—
  
-_It was a compliment!_ That poor receptionist was almost certainly thinking of [David Bowie](https://en.wikipedia.org/wiki/David_Bowie) or [Eddie Izzard](https://en.wikipedia.org/wiki/Eddie_Izzard), rather than being hateful and trying to hurt.
+_It was a compliment!_ That receptionist was almost certainly thinking of [David Bowie](https://en.wikipedia.org/wiki/David_Bowie) or [Eddie Izzard](https://en.wikipedia.org/wiki/Eddie_Izzard), rather than being hateful and trying to hurt.
  
  The author should have graciously accepted the compliment, and _done something to pass better next time_. The horror of trans culture is that it's impossible to imagine any of these people doing that—of noticing that they're behaving like a TERF's hostile stereotype of a narcissistic, gaslighting trans-identified male and snapping out of it.
  
@@ -722,30 +759,111 @@ But I would have expected people with the barest inkling of self-awareness and h
  
  And if that's too much to expect of the general public—
  
-And it's too much to expect garden-variety "rationalists" to figure out on their own without prompting from their betters—
+And if it's too much to expect garden-variety "rationalists" to figure out on their own without prompting from their superiors—
+
+Then I would have at least expected Eliezer Yudkowsky to take actions _in favor of_ rather than _against_ his faithful students having these very basic capabilities for reflection, self-observation, and ... _speech_? I would have expected Eliezer Yudkowsky to not _actively exert optimization pressure in the direction of transforming me into a Jane Austen character_.
+
+This is the part where Yudkowsky or his flunkies accuse me of being uncharitable, of failing at perspective-taking. Obviously, Yudkowsky doesn't _think of himself_ as trying to transform his faithful students into Jane Austen characters. One might ask if it does not therefore follow that I have failed to understand his position? [As Yudkowsky put it](https://twitter.com/ESYudkowsky/status/1435618825198731270):
+
+> The Other's theory of themselves usually does not make them look terrible. And you will not have much luck just yelling at them about how they must really be doing `terrible_thing` instead.
+
+But the substance of my accusations is not about Yudkowsky's _conscious subjective narrative_. I don't have a lot of uncertainty about Yudkowsky's _theory of himself_, because he told us that, very clearly: "it is sometimes personally prudent and not community-harmful to post your agreement with Stalin about things you actually agree with Stalin about, in ways that exhibit generally rationalist principles, especially because people do _know_ they're living in a half-Stalinist environment." I don't doubt that that's [how the algorithm feels from the inside](https://www.lesswrong.com/posts/yA4gF5KrboK2m2Xu7/how-an-algorithm-feels-from-inside).
+
+But my complaint is about the work the algorithm is _doing_ in Stalin's service, not about how it _feels_; I'm talking about a pattern of _publicly visible behavior_ stretching over years. (Thus, "take actions" in favor of/against, rather than "be"; "exert optimization pressure in the direction of", rather than "try".) I agree that everyone has a story in which they don't look terrible, and that people mostly believe their own stories, but _it does not therefore follow_ that no one ever does anything terrible.
+
+I agree that you won't have much luck yelling at the Other about how they must really be doing `terrible_thing`. (People get very invested in their own stories.) But if you have the _receipts_ of the Other repeatedly doing `terrible_thing` in public over a period of years, maybe yelling about it to _everyone else_ might help _them_ stop getting suckered by the Other's fraudulent story.
+
+Let's recap.
+
+[TODO: recap—
+* in 2009, "Changing Emotions"
+* in 2016, "20% of the ones with penises"
+* ...
+]
+
+
+Yudkowsky writes:
+
+> In terms of important things? Those would be all the things I've read—from friends, from strangers on the Internet, above all from human beings who are people—describing reasons someone does not like to be tossed into a Male Bucket or Female Bucket, as it would be assigned by their birth certificate, or perhaps at all.
+>
+> And I'm not happy that the very language I use, would try to force me to take a position on that; not a complicated nuanced position, but a binarized position, _simply in order to talk grammatically about people at all_.
+
+What does the "tossed into a bucket" metaphor refer to, though? I can think of many different things that might be summarized that way, and my sympathy for the one who does not like to be tossed into a bucket depends a lot on exactly what real-world situation is being mapped to the bucket.
+
+If we're talking about overt _gender role enforcement attempts_—things like, "You're a girl, therefore you need to learn to keep house for your future husband", or "You're a man, therefore you need to toughen up"—then indeed, I strongly support people who don't want to be tossed into that kind of bucket.
+
+(There are [historical reasons for the buckets to exist](/2020/Jan/book-review-the-origins-of-unfairness/), but I'm eager to bet on modern Society being rich enough and smart enough to either forgo the buckets, or at least let people opt-out of the default buckets, without causing too much trouble.)
  
-Then I would have at least expected Eliezer Yudkowsky to be _in favor of_ rather than _against_ his faithful students having these very basic capabilities for reflection, self-observation, and ... _speech_?
+But importantly, my support for people not wanting to be tossed into gender role buckets is predicated on their reasons for not wanting that _having genuine merit_—things like "The fact that I'm a juvenile female human doesn't mean I'll have a husband; I'm actually planning to become a nun", or "The sex difference in Big Five Neuroticism is only _d_ ≈ 0.5; your expectation that I be able to toughen up is not reasonable given the information you have about me in particular, even if most adult human males are tougher than me". I _don't_ think people have a _general_ right to prevent others from using sex categories to make inferences or decisions about them, _because that would be crazy_. If a doctor were to tell me, "As a male, you're at risk for prostate cancer," it would be _bonkers_ for me to reply that I don't like being tossed into a Male Bucket like that.
  
-I would have expected Eliezer Yudkowsky to not _actively exert optimization pressure in the direction of transforming me into a Jane Austen character_.
+While piously appealing to the feelings of people describing reasons they do not want to be tossed into a Male Bucket or a Female Bucket, Yudkowsky does not seem to be distinguishing between reasons that have merit, and reasons that do not have merit. The post continues (bolding mine):
  
+> In a wide variety of cases, sure, ["he" and "she"] can clearly communicate the unambiguous sex and gender of something that has an unambiguous sex and gender, much as a different language might have pronouns that sometimes clearly communicated hair color to the extent that hair color often fell into unambiguous clusters.
+>
+> But if somebody's hair color is halfway between two central points? If their civilization has developed stereotypes about hair color they're not comfortable with, such that they feel that the pronoun corresponding to their outward hair color is something they're not comfortable with because they don't fit key aspects of the rest of the stereotype and they feel strongly about that? If they have dyed their hair because of that, or **plan to get hair surgery, or would get hair surgery if it were safer but for now are afraid to do so?** Then it's stupid to try to force people to take complicated positions about those social topics _before they are allowed to utter grammatical sentences_.
+
+So, I agree that a language convention in which pronouns map to hair color doesn't seem great, and that the people in this world should probably coordinate on switching to a better convention, if they can figure out how.
+
+But taking as given the existence of a convention in which pronouns refer to hair color, a demand to be referred to as having a hair color _that one does not in fact have_ seems pretty outrageous to me!
+
+It makes sense to object to the convention forcing a binary choice in the "halfway between two central points" case. That's an example of _genuine_ nuance brought on by a _genuine_ challenge to a system that _falsely_ assumes discrete hair colors.
+
+But ... "plan to get hair surgery"? "Would get hair surgery if it were safer but for now are afraid to do so"? In what sense do these cases present a challenge to the discrete system and therefore call for complication and nuance? There's nothing ambiguous about these cases: if you haven't, in fact, changed your hair color, then your hair is, in fact, its original color. The decision to get hair surgery does not _propagate backwards in time_. The decision to get hair surgery cannot be _imported from a counterfactual universe in which it is safer_. People who, today, do not have the hair color that they would prefer, are, today, going to have to deal with that fact _as a fact_.
+
+Is the idea that we want to use the same pronouns for the same person over time, so that if we know someone is going to get hair surgery—they have an appointment with the hair surgeon at this-and-such date—we can go ahead and switch their pronouns in advance? Okay, I can buy that.
+
+But extending that to the "would get hair surgery if it were safer" case is _absurd_. No one treats _conditional plans assuming speculative future advances in medical technology_ the same as actual plans. I don't think this case calls for any complicated nuanced position, and I don't see why Eliezer Yudkowsky would suggest that it would, unless the real motive for insisting on complication and nuance is as an obfuscation tactic—unless, at some level, Eliezer Yudkowsky doesn't expect his followers to deal with facts?
+
+Maybe the problem is easier to see in the context of a non-gender example? [My previous hopeless ideological war—before this one—was against the conflation of _schooling_ and _education_](/2022/Apr/student-dysphoria-and-a-previous-lifes-war/): I hated being tossed into the Student Bucket, as it would be assigned by my school course transcript, or perhaps at all. But crucially, my tirades against the Student Bucket described reasons not just that _I didn't like it_, but reasons that the bucket was _actually wrong on the empirical merits_: people can and do learn important things by studying and practicing out of their own curiosity and ambition; the system was _actually in the wrong_ for assuming that nothing you do matters unless you do it on the command of a designated "teacher" while enrolled in a designated "course".
+
+And _because_ my war footing was founded on the empirical merits, I knew that I had to _update_ to the extent that the empirical merits showed that I was in the wrong. In 2010, I took a differential equations class "for fun" at the local community college, expecting to do well and thereby prove that my previous couple years of math self-study had been the equal of any schoolstudent's.
+
+In fact, I did very poorly and scraped by with a _C_. (Subjectively, I felt like I "understood the concepts", and kept getting surprised when that understanding somehow didn't convert into passing quiz scores.) That hurt. That hurt a lot.
+
+_It was supposed to hurt_. One could imagine a Jane Austen character in this situation doubling down on his antagonism to everything school-related, in order to protect himself from being hurt—to protest that the teacher hated him, that the quizzes were unfair, that the answer key must have had a printing error—in short, that he had been right in every detail all along, and that any suggestion otherwise was credentialist propaganda.
+
+I knew better than to behave like that—and to the extent that I was tempted, I retained my ability to notice and snap out of it. My failure _didn't_ mean I had been wrong about everything, that I should humbly resign myself to the Student Bucket forever and never dare to question it again—but it _did_ mean that I had been wrong about _something_. I could [update myself incrementally](https://www.lesswrong.com/posts/627DZcvme7nLDrbZu/update-yourself-incrementally)—but I _did_ need to update. (Perhaps, that "math" encompasses different subskills, and that my glorious self-study had unevenly trained some skills and not others: there was nothing contradictory about my [successfully generalizing one of the methods in the textbook to arbitrary numbers of variables](https://math.stackexchange.com/questions/15143/does-the-method-for-solving-exact-des-generalize-like-this), while _also_ [struggling with the class's assigned problem sets](https://math.stackexchange.com/questions/7984/automatizing-computational-skills).)
+
+Someone who uncritically validated my not liking to be tossed into the Student Bucket, instead of assessing my _reasons_ for not liking to be tossed into the Bucket and whether those reasons had merit, would be hurting me, not helping me—because in order to navigate the real world, I need a map that reflects the territory, rather than my narcissistic fantasies. I'm a better person for straightforwardly facing the shame of getting a _C_ in community college differential equations, rather than trying to deny it or run away from it or claim that it didn't mean anything. Part of updating myself incrementally was that I would get _other_ chances to prove that my autodidacticism could match the standard set by schools. (I've had a professional and open-source programming career without finishing college; when I audited honors analysis at UC Berkeley "for fun" in 2017, I did fine on the midterm; when applying for a new dayjob in 2018, the interviewer, noting my lack of a degree, said he was going to give a version of the interview without a computer science theory question. I insisted on being given the "college" version of the interview, solved a dynamic programming problem, and got the job. And so on.)
+
+If you can see why uncritically affirming people's current self-image isn't the right solution to "student dysphoria", it should be obvious why the same is true of gender dysphoria. The principle that _truth matters_ is very general!
+
+In an article titled ["Actually, I Was Just Crazy the Whole Time"](https://somenuanceplease.substack.com/p/actually-i-was-just-crazy-the-whole), detransitioner Michelle Alleva contrasts her beliefs at the time of deciding to transition, with her current beliefs. While transitioning, she accounted for many pieces of evidence about herself ("dislike attention as a female", "obsessive thinking about gender", "didn't fit in with the girls", _&c_.) in terms of the theory "It's because I'm trans." But now, Alleva writes, she thinks she has a variety of better explanations that, all together, cover everything on the original list: "It's because I'm autistic", "It's because I have unresolved trauma", "It's because women are often treated poorly" ... including "That wasn't entirely true" (!!).
+
+This is a _rationality_ skill. Alleva had a theory about herself, and then she _revised her theory upon further consideration of the evidence_. Beliefs about one's self aren't special and can be updated using the _same_ methods that you would use for anything else—[just as a recursively self-improving AI would reason the same about transistors "inside" the AI and transistors in "the environment."](https://www.lesswrong.com/posts/TynBiYt6zg42StRbb/my-kind-of-reflection)
+
+[TODO: I'm praising the form of the inference; not the conclusion; homosexual transsexuals who update to "born in the wrong body" at least have a case; for people like me, and separately people like Alleva, it's just not true; if you coddle "Female Bucket" sentiments, you're outlawing updates]
+
+This also isn't a particularly _advanced_ rationality skill. This is very basic—something novices grasp during their early steps along the Way. 
+
+There was an exchange in the comment section between me and Yudkowsky back during the early days of _Less Wrong_, when I still hadn't grown out of [my teenage religion of psychological sex differences denialism](/2021/May/sexual-dimorphism-in-the-sequences-in-relation-to-my-gender-problems/#antisexism). Yudkowsky had claimed that he had ["never known a man with a true female side, and I have never known a woman with a true male side, either as authors or in real life."](https://www.lesswrong.com/posts/FBgozHEv7J72NCEPB/my-way/comment/K8YXbJEhyDwSusoY2) Offended at our leader's sexism (but sensing no socially acceptable way to express it), I timidly [asked him to elaborate](https://www.lesswrong.com/posts/FBgozHEv7J72NCEPB/my-way?commentId=AEZaakdcqySmKMJYj), and as part of [his response](https://www.greaterwrong.com/posts/FBgozHEv7J72NCEPB/my-way/comment/W4TAp4LuW3Ev6QWSF), he mentioned that he "sometimes wish[ed] that certain women would appreciate that being a man is at least as complicated and hard to grasp and a lifetime's work to integrate, as the corresponding fact of feminity [_sic_]."
+
+[I replied](https://www.lesswrong.com/posts/FBgozHEv7J72NCEPB/my-way/comment/7ZwECTPFTLBpytj7b) (bolding added):
  
+> I sometimes wish that certain men would appreciate that not all men are like them—**or at least, that not all men _want_ to be like them—that the fact of masculinity is [not _necessarily_ something to integrate](https://www.lesswrong.com/posts/vjmw8tW6wZAtNJMKo/which-parts-are-me).**
  
+_I knew_. Even then, _I knew_
  
-[TODO section: rats from the Scott Alexander era will protest that I'm being uncharitable—failure of perspective taking; but I'm not complaining about Yudkowsky's subjective experience; I'm talking about a very clear pattern of behavior that's gone on for _years_]
  
  
-[TODO: let's recap]
  
  
-[TODO: the important thing is not being put in a box
+[TODO section Feelings vs. Truth
+This is a conflict between Feelings and Truth, between Politics and Truth.
  
+Scott Alexander chose Feelings, but I can't really hold that against him, because Scott is very explicit about only acting in the capacity of some guy with a blog. You can tell that he never wanted to be a religious leader; it just happened to him by accident because he writes faster than everyone else. I like Scott. Scott is great. I feel bad that such a large fraction of my interactions with him over the years have taken such an adversarial tone.
  
-student dysphoria—I hated being put in the box as student; 
+Eliezer Yudkowsky ... did not _unambiguously_ choose Feelings. He's been very careful with his words to strategically mood-affiliate with the side of Feelings, without consciously saying anything that he knows to be unambiguously false.
  
- Scott Alexander chose feelings, but I don't hold that against him; self-aggrandizement]
  
  
  
+Eliezer Yudkowsky is _absolutely_ trying to be a religious leader.
+
+If Eliezer Yudkowsky can't _unambiguously_ choose Truth over Feelings, _then Eliezer Yudkowsky is a fraud_.
+
+]
+
  
  [TODO section stakes, cooperation
  
@@ -753,19 +871,21 @@ student dysphoria—I hated being put in the box as student;
  >
  > _Perhaps_, echoed the other part of himself, _but that is not what was actually happening._
  
+
+
+
  I could forgive him for taking a shit on d4 of my chessboard (["at least 20% of the ones with penises are actually women"](https://www.facebook.com/yudkowsky/posts/10154078468809228)). I could even forgive him for subsequently taking a shit on e4 of my chessboard (["you're not standing in defense of truth if you insist on a word [...]"](https://twitter.com/ESYudkowsky/status/1067198993485058048)) as long as he wiped most of the shit off afterwards (["you are being the bad guy if you try to shut down that conversation by saying that 'I can define the word "woman" any way I want'"](https://www.facebook.com/yudkowsky/posts/10158853851009228)), even though, really, I would have expected someone so smart to take a hint after the incident on d4.
  
-But if he's _then_ going to take a shit on c3 of my chessboard (["the simplest and best protocol is, '"He" refers to the set of people who have asked us to use "he" [...]'"](https://www.facebook.com/yudkowsky/posts/10159421750419228))
+But if he's _then_ going to take a shit on c3 of my chessboard (["the simplest and best protocol is, '"He" refers to the set of people who have asked us to use "he" [...]'"](https://www.facebook.com/yudkowsky/posts/10159421750419228)),
  
-the turd on c3 is a pretty big likelihood ratio 
  
-]
  
+The turd on c3 is a pretty big likelihood ratio!
  
  
-[TODO: in the context of elite Anglosphere culture in 2016–2022; it should be clear that defenders of reason need to be able to push back and assert that biological sex is real; other science communicators like 
+As the traditional rationalist saying goes: once is happenstance. Twice is coincidence. _Three times is enemy optimization_.
  
-[Dawkins can see it.](https://www.theguardian.com/books/2021/apr/20/richard-dawkins-loses-humanist-of-the-year-trans-comments) [Jerry Coyne can see it.](https://whyevolutionistrue.com/2018/12/11/once-again-why-sex-is-binary/)]
+]
  
  
  
@@ -774,7 +894,7 @@ the turd on c3 is a pretty big likelihood ratio
  https://twitter.com/ESYudkowsky/status/1404697716689489921
  > I have never in my own life tried to persuade anyone to go trans (or not go trans)—I don't imagine myself to understand others that much.
  
-If you think it "sometimes personally prudent and not community-harmful" to strategically say positive things about Republican candidates, and make sure to never, ever say negative things about Democratic candidates (because you "don't see what the alternative is besides getting shot"), you can see why people might regard you as a _Republican shill_—even if all the things you said were true, and even if you never told any specific individual, "You should vote Republican."
+If you think it "sometimes personally prudent and not community-harmful" to strategically say positive things about Republican candidates, and make sure to never, ever say positive things about Democratic candidates (because you "don't see what the alternative is besides getting shot"), you can see why people might regard you as a _Republican shill_—even if all the things you said were true, and even if you never told any specific individual, "You should vote Republican."
  
  https://www.facebook.com/yudkowsky/posts/10154110278349228
  > Just checked my filtered messages on Facebook and saw, "Your post last night was kind of the final thing I needed to realize that I'm a girl."
@@ -800,21 +920,32 @@ In the context of AI alignment theory, Yudkowsky has written about a "nearest un
  
  Suppose you developed an AI to [maximize human happiness subject to the constraint of obeying explicit orders](https://arbital.greaterwrong.com/p/nearest_unblocked#exampleproducinghappiness). It might first try administering heroin to humans. When you order it not to, it might switch to administering cocaine. When you order it to not use any of a whole list of banned happiness-producing drugs, it might switch to researching new drugs, or just _pay_ humans to take heroin, _&c._
  
-It's the same thing with Yudkowsky's political-risk minimization subject to the constraint of not saying anything he knows to be false. First he comes out with ["I think I'm over 50% probability at this point that at least 20% of the ones with penises are actually women"](https://www.facebook.com/yudkowsky/posts/10154078468809228) (March 2016). When you point out that [that's not true](https://www.lesswrong.com/posts/QZs4vkC7cbyjL9XA9/changing-emotions), then the next time he revisits the subject, he switches to ["you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning"](https://archive.is/Iy8Lq) (November 2018). When you point out that [_that's_ not true either](https://www.lesswrong.com/posts/FaJaCgqBKphrDzDSj/37-ways-that-words-can-be-wrong), he switches to "It is Shenanigans to try to bake your stance on how clustered things are [...] _into the pronoun system of a language and interpretation convention that you insist everybody use_" (February 2021). When you point out that's not what's going on, he switches to ... I don't know, but he's a smart guy; in the unlikely event that he sees fit to respond to this post, I'm sure he'll be able to think of _something_—but at this point, I have no reason to care. Talking to Yudkowsky on topics where getting the right answer would involve acknowledging facts that would make you unpopular in Berkeley is a _waste of everyone's time_; trying to inform you isn't [his bottom line](https://www.lesswrong.com/posts/34XxbRFe54FycoCDw/the-bottom-line).
+It's the same thing with Yudkowsky's political-risk minimization subject to the constraint of not saying anything he knows to be false. First he comes out with ["I think I'm over 50% probability at this point that at least 20% of the ones with penises are actually women"](https://www.facebook.com/yudkowsky/posts/10154078468809228) (March 2016). When you point out that [that's not true](https://www.lesswrong.com/posts/QZs4vkC7cbyjL9XA9/changing-emotions), then the next time he revisits the subject, he switches to ["you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning"](https://archive.is/Iy8Lq) (November 2018). When you point out that [_that's_ not true either](https://www.lesswrong.com/posts/FaJaCgqBKphrDzDSj/37-ways-that-words-can-be-wrong), he switches to "It is Shenanigans to try to bake your stance on how clustered things are [...] _into the pronoun system of a language and interpretation convention that you insist everybody use_" (February 2021). When you point out [that's not what's going on](/2022/Mar/challenges-to-yudkowskys-pronoun-reform-proposal/), he switches to ... I don't know, but he's a smart guy; in the unlikely event that he sees fit to respond to this post, I'm sure he'll be able to think of _something_—but at this point, _I have no reason to care_. Talking to Yudkowsky on topics where getting the right answer would involve acknowledging facts that would make you unpopular in Berkeley is a _waste of everyone's time_; trying to inform you isn't [his bottom line](https://www.lesswrong.com/posts/34XxbRFe54FycoCDw/the-bottom-line).
  
  Accusing one's interlocutor of bad faith is frowned upon for a reason. We would prefer to live in a world where we have intellectually fruitful object-level discussions under the assumption of good faith, rather than risk our fora degenerating into an acrimonious brawl of accusations and name-calling, which is unpleasant and (more importantly) doesn't make any intellectual progress. I, too, would prefer to have a real object-level discussion under the assumption of good faith.
  
-I tried the object-level good-faith argument thing _first_. I tried it for _years_. But at some point, I think I should be _allowed to notice_ the nearest-unblocked-strategy game which is _very obviously happening_ if you look at the history of what was said. I think there's _some_ number of years and _some_ number of thousands of words of litigating the object-level _and_ the meta level after which there's nothing left for me to do but jump up to the meta-meta level and explain, to anyone capable of hearing it, why in this case I think I've accumulated enough evidence in this case for the assumption of good faith to have been _empirically falsified_.
+Accordingly, I tried the object-level good-faith argument thing _first_. I tried it for _years_. But at some point, I think I should be _allowed to notice_ the nearest-unblocked-strategy game which is _very obviously happening_ if you look at the history of what was said. I think there's _some_ number of years and _some_ number of thousands of words of litigating the object-level _and_ the meta level after which there's nothing left for me to do but jump up to the meta-meta level and explain, to anyone capable of hearing it, why in this case I think I've accumulated enough evidence for the assumption of good faith to have been _empirically falsified_.
  
-(Of course, I realize that if we're crossing the Rubicon of abandoning the norm of assuming good faith, it needs to be abandoned symmetrically. I _think_ I'm doing a _pretty good_ job of adhering to standards of intellectual conduct and being transparent about my motivations, but I'm definitely not perfect, and, unlike Yudkowsky, I'm not so absurdly miscalibratedly arrogant to claim "confidence in my own ability to independently invent everything important" (!) about my topics of interest. If Yudkowsky or anyone else thinks they _have a case_ based on my behavior that _I'm_ being culpably intellectually dishonest, they of course have my blessing and encouragement to post it for the audience to evaluate.)
+(Obviously, if we're crossing the Rubicon of abandoning the norm of assuming good faith, it needs to be abandoned symmetrically. I _think_ I'm doing a _pretty good_ job of adhering to standards of intellectual conduct and being transparent about my motivations, but I'm definitely not perfect, and, unlike Yudkowsky, I'm not so absurdly miscalibratedly arrogant as to claim "confidence in my own ability to independently invent everything important" (!) about my topics of interest. If Yudkowsky or anyone else thinks they _have a case_ based on my behavior that _I'm_ being culpably intellectually dishonest, they of course have my blessing and encouragement to post it for the audience to evaluate.)
  
-What makes all of this especially galling is the fact that _all of my heretical opinions are literally just Yudkowsky's opinions from the 'aughts!_ My whole thing about how changing sex isn't possible with existing technology because the category encompasses so many high-dimensional details? Not original to me! I [filled in a few trivial technical details](/2021/May/sexual-dimorphism-in-the-sequences-in-relation-to-my-gender-problems/#changing-sex-is-hard), but again, this was _in the Sequences_ as ["Changing Emotions"](https://www.lesswrong.com/posts/QZs4vkC7cbyjL9XA9/changing-emotions). My thing about how you can't define concepts any way you want, because there are mathematical laws governing which category boundaries compress your anticipated experiences? Not original to me! I [filled in](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries) [a few technical details](https://www.lesswrong.com/posts/onwgTH6n8wxRSo2BJ/unnatural-categories-are-optimized-for-deception), but [_we had a whole Sequence about this._](https://www.lesswrong.com/posts/FaJaCgqBKphrDzDSj/37-ways-that-words-can-be-wrong)
+**What makes all of this especially galling is the fact that _all of my heretical opinions are literally just Yudkowsky's opinions from the 'aughts!_** My whole thing about how changing sex isn't possible with existing technology because the category encompasses so many high-dimensional details? Not original to me! I [filled in a few technical details](/2021/May/sexual-dimorphism-in-the-sequences-in-relation-to-my-gender-problems/#changing-sex-is-hard), but again, this was _in the Sequences_ as ["Changing Emotions"](https://www.lesswrong.com/posts/QZs4vkC7cbyjL9XA9/changing-emotions). My thing about how you can't define concepts any way you want because there are mathematical laws governing which category boundaries compress your anticipated experiences? Not original to me! I [filled in](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries) [a few technical details](https://www.lesswrong.com/posts/onwgTH6n8wxRSo2BJ/unnatural-categories-are-optimized-for-deception), but [_we had a whole Sequence about this._](https://www.lesswrong.com/posts/FaJaCgqBKphrDzDSj/37-ways-that-words-can-be-wrong)
  
-Seriously, you think I'm _smart enough_ to come up with all of this independently? I'm not! I ripped it all off from Yudkowsky back in the 'aughts _when he still gave a shit about telling the truth_ in this domain. (More precisely, when he thought he could afford to give a shit, before the political environment and the growing stature of his so-called "rationalist" movement changed his incentives.)
+Seriously, you think I'm _smart enough_ to come up with all of this independently? I'm not! I ripped it all off from Yudkowsky back in the 'aughts _when he still gave a shit about telling the truth_. (Actively telling the truth, and not just technically not lying.)
  
  Does ... does he expect us not to _notice_? Or does he think that "everybody knows"?
  
-[TODO: the dolphin war, our thoughts about dolphins are literally downstream from Scott's political incentives in 2014; this is a sign that we're a cult]
+But I don't, think that everybody knows. And I am not, giving up that easily. Not on an entire subculture full of people.
+
+
+
+[TODO: the dolphin war, our thoughts about dolphins are literally downstream from Scott's political incentives in 2014; this is a sign that we're a cult
+
+https://twitter.com/ESYudkowsky/status/1404700330927923206
+> That is: there's a story here where not just particular people hounding Zack as a responsive target, but a whole larger group, are engaged in a dark conspiracy that is all about doing damage on issues legible to Zack and important to Zack.  This is merely implausible on priors.
+
+I mean, I wouldn't _call_ it a "dark conspiracy" exactly, but if the people with intellectual authority are computing what to say on the principle of "it is sometimes personally prudent and not community-harmful to post [their] agreement with Stalin", and Stalin cares a lot about doing damage on issues legible and important to me, then, pragmatically, I think that has _similar effects_ on the state of our collective knowledge as a dark conspiracy, even if the mechanism of coordination is each individual being separately terrified of Stalin, rather than them meeting with dark robes to plot under a full moon.
+
+]
  
  [TODO: sneering at post-rats; David Xu interprets criticism of Eliezer as me going "full post-rat"?! 
  
@@ -824,17 +955,17 @@ https://twitter.com/davidxu90/status/1435106339550740482
  ]
  
  
-David Xu writes (with Yudkowsky ["endors[ing] everything [he] just said"](https://twitter.com/ESYudkowsky/status/1436025983522381827)):
+David Xu writes (with Yudkowsky ["endors[ing] everything [Xu] just said"](https://twitter.com/ESYudkowsky/status/1436025983522381827)):
  
-> I'm curious what might count for you as a crux about this; candidate cruxes I could imagine include: whether some categories facilitate inferences that _do_, on the whole, cause more harm than benefit, and if so, whether it is "rational" to rule that such inferences should be avoided when possible, and if so, whether the best way to disallow a large set of potential inferences is the proscribe the use of the categories that facilitate them—and if _not_, whether proscribing the use of a category in _public communication_ constitutes "proscribing" it more generally, in a way that interferes with one's ability to perform "rational" thinking in the privacy of one's own mind.
+> I'm curious what might count for you as a crux about this; candidate cruxes I could imagine include: whether some categories facilitate inferences that _do_, on the whole, cause more harm than benefit, and if so, whether it is "rational" to rule that such inferences should be avoided when possible, and if so, whether the best way to disallow a large set of potential inferences is [to] proscribe the use of the categories that facilitate them—and if _not_, whether proscribing the use of a category in _public communication_ constitutes "proscribing" it more generally, in a way that interferes with one's ability to perform "rational" thinking in the privacy of one's own mind.
  >
  > That's four possible (serial) cruxes I listed, one corresponding to each "whether". 
  
-On the first and second cruxes, concerning whether some categories facilitate inferences that cause more harm than benefit on the whole and whether they should be avoided when possible, I ask: harm _to whom?_ Not all agents have the same utility function! If some people are harmed by other people making certain probabilistic inferences, then it would seem that there's a _conflict_ between the people harmed (who prefer that such inferences be avoided if possible), and people who want to make and share probabilistic inferences about reality (who think that that which can be destroyed by the truth, should be).
+I reply: on the first and second cruxes, concerning whether some categories facilitate inferences that cause more harm than benefit on the whole and whether they should be avoided when possible, I ask: harm _to whom?_ Not all agents have the same utility function! If some people are harmed by other people making certain probabilistic inferences, then it would seem that there's a _conflict_ between the people harmed (who prefer that such inferences be avoided if possible), and people who want to make and share probabilistic inferences about reality (who think that that which can be destroyed by the truth, should be).
  
  On the third crux, whether the best way to disallow a large set of potential inferences is to proscribe the use of the categories that facilitate them: well, it's hard to be sure whether it's the _best_ way: no doubt a more powerful intelligence could search over a larger space of possible strategies than me. But yeah, if your goal is to _prevent people from noticing facts about reality_, then preventing them from using words that refer to those facts seems like a pretty effective way to do it!
  
-On the fourth crux, whether proscribing the use of a category in public communication constitutes "proscribing" in a way that interferes with one's ability to think in the privacy of one's own mind: I think this is true (for humans). We're social animals. To the extent that we can do higher-grade cognition at all, we do it (even when alone) using our language faculties that are designed for communicating with others. How are you supposed to think about things that you don't have words for?
+On the fourth crux, whether proscribing the use of a category in public communication constitutes "proscribing" in a way that interferes with one's ability to think in the privacy of one's own mind: I think this is mostly true for humans. We're social animals. To the extent that we can do higher-grade cognition at all, we do it using our language faculties that are designed for communicating with others. How are you supposed to think about things that you don't have words for?
  
  Xu continues:
  
@@ -846,26 +977,25 @@ Xu continues:
  >
  > This is the sense in which I suspect you are coming across as failing to properly Other-model.
  
-I reply: I'd like to [taboo](https://www.lesswrong.com/posts/WBdvyyHLdxZSAMmoz/taboo-your-words) the word "rational"; I think I can do a much better job of explaining what's going on without appealing to what is or is not "rational." (As it is written of a virtue which is nameless, if you speak overmuch of the Way, you will not attain it.)
-
-Thus, bearing in mind that we don't all need to count harms and benefits the same way, and that it is futile to contest what kind of prescriptions "rational" thinking entails, on the question of whether the dividing line between my behavior and the Caliphate's is caused by a disagreement as to whether "rational" thinking is "worth it", I'm inclined to say—
+After everything I've been through, I'm inclined to think it's not a "disagreement" at all.
  
-It's not a "disagreement" at all. It's a _conflict_.
+It's a _conflict_. I think what's actually at issue is that, at least in this domain, I want people to tell the truth, and the Caliphate wants people to not tell the truth. This isn't a disagreement about rationality, because telling the truth _isn't_ rational _if you don't want people to know things_.
  
+At this point, I imagine defenders of the Caliphate are shaking their heads in disappointment at how I'm doubling down on refusing to Other-model. But—_am_ I? Isn't this just a re-statement of Xu's first proposed crux, except reframed as a "values difference" rather than a "disagreement"? Is the problem that my use of the phrase "tell the truth" (which has positive valence in our culture) functions to sneak in normative connotations favoring "my side"?
  
-Telling the truth _isn't_ rational _if you don't want people to know things_.
+Fine. Objection sustained. I'm happy to use Xu's language. I think what's actually at issue is that, at least in this domain, I want to facilitate people making inferences (full stop), and the Caliphate wants to _not_ facilitate people making inferences that, on the whole, cause more harm than benefit. This isn't a disagreement about rationality, because facilitating inferences _isn't_ rational _if you don't want people to make inferences_.
  
+[TODO: quote "Doublethink (Choosing to Be Biased)", note that despite Yudkowsky's doubt, the situation is actually worse than Orwell depicted—you don't even have to burn the offending material, if you can just get people to ignore it]
  
-I have a _selfish_ interest in people making and sharing accurate probabilistic inferences about how sex and gender and transgenderedness work in reality, for many reasons, but in part because _I need the correct answer in order to decide whether or not to cut my dick off_.
  
  [TODO:
  "massive psychological damage to some subset of people", 
-that's _not my problem_. I _don't give a shit_.
-
-Berkeley people may say that I'm doubling-down on failing to Other-model, but I don't think so; it's more honest to notice the conflict and analyze the conflict, than to pretend that we all want the same thing; I can empathize with "playing on a different chessboard", and I would be more inclined to cooperate with it if it weren't accompanied by sneering about how he and his flunkies are the only sane and good people in the world]
+that's _not my problem_. I _don't give a shit_.]
  
  [TODO: if he's reading this, win back respect— reply, motherfucker]
  
  [TODO: the Death With Dignity era]
  
-[TODO: regrets]
+I don't, actually, know how to prevent the world from ending. Probably we were never going to survive. (The cis-human era of Earth-originating intelligent life wasn't going to last forever, and it's hard to exert detailed control over what comes next.) But if we're going to die either way, I think it would be _more dignified_ if Eliezer Yudkowsky were to behave as if he wanted his faithful students to be informed. Since it doesn't look like we're going to get that, I think it's _more dignified_ if his faithful students _know_ that he's not behaving like he wants us to be informed. And so one of my goals in telling you this long story about how I spent (wasted?) the last six years of my life, is to communicate the moral that **I don't trust Eliezer Yudkowsky to tell the truth, and I don't think you should trust him, either**—and that this is a _problem_ for the future of humanity, to the extent that there is a future of humanity.
+
+Is that a mean thing to say about someone to whom I owe so much? Probably. But if it helps—he didn't create me to not say mean things. As far as _I_ can tell, I'm only doing what he taught me to do in 2007–9: [carve reality at the joints](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries), [speak the truth even if your voice trembles](https://www.lesswrong.com/posts/pZSpbxPrftSndTdSf/honesty-beyond-internal-truth), and [make an extraordinary effort](https://www.lesswrong.com/posts/GuEsfTpSDSbXFiseH/make-an-extraordinary-effort) when you've got [Something to Protect](https://www.lesswrong.com/posts/SGR4GxFK7KmW7ckCB/something-to-protect).