If Clarity Seems Like Death to Them

"—but if one hundred thousand [normies] can turn up, to show their support for the [rationalist] community, why can't you?"

I said wearily, "Because every time I hear the word community, I know I'm being manipulated. If there is such a thing as the [rationalist] community, I'm certainly not a part of it. As it happens, I don't want to spend my life watching [rationalist and effective altruist] television channels, using [rationalist and effective altruist] news systems ... or going to [rationalist and effective altruist] street parades. It's all so ... proprietary. You'd think there was a multinational corporation who had the franchise rights on [truth and goodness]. And if you don't market the product their way, you're some kind of second-class, inferior, bootleg, unauthorized [nerd]."

—"Cocoon" by Greg Egan (paraphrased)1

Recapping my Whole Dumb Story so far: in a previous post, "Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems", I told you about how I've always (since puberty) had this obsessive erotic fantasy about being magically transformed into a woman and how I used to think it was immoral to believe in psychological sex differences, until I read these great Sequences of blog posts by Eliezer Yudkowsky which incidentally pointed out how absurdly impossible my obsessive fantasy was ...

—none of which gooey private psychological minutiæ would be in the public interest to blog about, except that, as I explained in a subsequent post, "Blanchard's Dangerous Idea and the Plight of the Lucid Crossdreamer", around 2016, everyone in the community that formed around the Sequences suddenly decided that guys like me might actually be women in some unspecified metaphysical sense, and the cognitive dissonance from having to rebut all this nonsense coming from everyone I used to trust drove me temporarily insane from stress and sleep deprivation ...

—which would have been the end of the story, except that, as I explained in a subsequent–subsequent post, "A Hill of Validity in Defense of Meaning", in late 2018, Eliezer Yudkowsky prevaricated about his own philosophy of language in a way that suggested that people were philosophically confused if they disputed that men could be women in some unspecified metaphysical sense.

Anyone else being wrong on the internet like that wouldn't have seemed like a big deal, but Scott Alexander had semi-jokingly written that rationalism is the belief that Eliezer Yudkowsky is the rightful caliph. After extensive attempts by me and allies to get clarification from Yudkowsky amounted to nothing, we felt justified in concluding that he and his Caliphate of so-called "rationalists" were corrupt.

Origins of the Rationalist Civil War (April–May 2019)

Anyway, given that the "rationalists" were fake and that we needed something better, there remained the question of what to do about that, and how to relate to the old thing.

I had been hyperfocused on prosecuting my Category War, but the reason Michael Vassar and Ben Hoffman and Jessica Taylor2 were willing to help me out was not that they particularly cared about the gender and categories example but that it seemed like a manifestation of a more general problem of epistemic rot in "the community."

Ben had previously worked at GiveWell and had written a lot about problems with the Effective Altruism (EA) movement; in particular, he argued that EA-branded institutions were making incoherent decisions under the influence of incentives to distort information in order to seek power.

Jessica had previously worked at MIRI, where she was unnerved by what she saw as under-evidenced paranoia about information hazards and short AI timelines. (As Jack Gallagher, who was also at MIRI at the time, later put it, "A bunch of people we respected and worked with had decided the world was going to end, very soon, uncomfortably soon, and they were making it extremely difficult for us to check their work.")

To what extent were my gender and categories thing, and Ben's EA thing, and Jessica's MIRI thing, manifestations of the same underlying problem? Or had we all become disaffected with the mainstream "rationalists" for our own idiosyncratic reasons, and merely randomly fallen into each other's, and Michael's, orbit?

If there was a real problem, I didn't have a good grasp on it. Cultural critique is a fraught endeavor: if someone tells an outright lie, you can, maybe, with a lot of effort, prove that to other people and get a correction on that specific point. (Although as we had just discovered, even that might be too much to hope for.) But culture is the sum of lots and lots of little micro-actions by lots and lots of people. If your entire culture has visibly departed from the Way that was taught to you in the late 'aughts, how do you demonstrate that to people who are acting like they don't remember the old Way, or that they don't think anything has changed, or that they notice some changes but think the new way is better? It's not as simple as shouting, "Hey guys, Truth matters!" Any ideologue or religious person would agree with that. It's not feasible to litigate every petty epistemic crime in something someone said, and if you tried, someone who thought the culture was basically on track could accuse you of cherry-picking. If "culture" is a real thing at all—and it certainly seems to be—we are condemned to grasp it unclearly, relying on the brain's pattern-matching faculties to sum over thousands of little micro-actions as a gestalt.

Ben called the gestalt he saw the Blight, after the rogue superintelligence in Vernor Vinge's A Fire Upon the Deep. The problem wasn't that people were getting dumber; it was that they were increasingly behaving in a way that was better explained by their political incentives than by coherent beliefs about the world; they were using and construing facts as moves in a power game, albeit sometimes subject to genre constraints under which only true facts were admissible moves in the game.

When I asked Ben for specific examples of MIRI or CfAR leaders behaving badly, he gave the example of MIRI executive director Nate Soares posting that he was "excited to see OpenAI joining the space", despite the fact that no one who had been following the AI risk discourse thought that OpenAI as originally announced was a good idea. Nate had privately clarified that the word "excited" wasn't necessarily meant positively—and in this case meant something more like "terrified."

This seemed to me like the sort of thing where a particularly principled (naïve?) person might say, "That's lying for political reasons! That's contrary to the moral law!" and most ordinary grown-ups would say, "Why are you so upset about this? That sort of strategic phrasing in press releases is just how the world works."

I thought explaining the Blight to an ordinary grown-up was going to need either lots of specific examples that were more egregious than this (and more egregious than the examples in Sarah Constantin's "EA Has a Lying Problem" or Ben's "Effective Altruism Is Self-Recommending"), or somehow convincing the ordinary grown-up why "just how the world works" isn't good enough, and why we needed one goddamned place in the entire goddamned world with unusually high standards.

The schism introduced new pressures on my social life. I told Michael that I still wanted to be friends with people on both sides of the factional schism. Michael said that we should unambiguously regard Yudkowsky and CfAR president (and my personal friend of ten years) Anna Salamon as criminals or enemy combatants who could claim no rights in regard to me or him.

I don't think I got the framing at this time. War metaphors sounded scary and mean: I didn't want to shoot my friends! But the point of the analogy (which Michael explained, but I wasn't ready to hear until I did a few more weeks of emotional processing) was specifically that soldiers on the other side of a war aren't necessarily morally blameworthy as individuals:3 their actions are being directed by the Power they're embedded in.

I wrote to Anna (Subject: "Re: the end of the Category War (we lost?!?!?!)"):

I was just trying to publicly settle a very straightforward philosophy thing that seemed really solid to me

if, in the process, I accidentally ended up being an unusually useful pawn in Michael Vassar's deranged four-dimensional hyperchess political scheming

that's ... arguably not my fault


I may have subconsciously pulled off an interesting political maneuver. In my final email to Yudkowsky on 20 April 2019 (Subject: "closing thoughts from me"), I had written—

If we can't even get a public consensus from our de facto leadership on something so basic as "concepts need to carve reality at the joints in order to make probabilistic predictions about reality", then, in my view, there's no point in pretending to have a rationalist community, and I need to leave and go find something else to do (perhaps whatever Michael's newest scheme turns out to be). I don't think I'm setting my price for joining particularly high here?4

And as it happened, on 4 May 2019, Yudkowsky retweeted Colin Wright on the "univariate fallacy"—the point that group differences aren't a matter of any single variable—which was thematically similar to the clarification I had been asking for. (Empirically, it made me feel less aggrieved.) Was I wrong to interpret this as another "concession" to me? (Again, notwithstanding that the whole mindset of extracting "concessions" was corrupt and not what our posse was trying to do.)

Separately, one evening in April, I visited the house where "Meredith" and her husband Mike and Kelsey Piper and some other people lived, which I'll call "Arcadia".5 I said, essentially, "Oh man oh jeez, Ben and Michael want me to join in a rationalist civil war against the corrupt mainstream-rationality establishment, and I'd really rather not, and I don't like how they keep using scary hyperbolic words like 'cult' and 'war' and 'criminal', but on the other hand, they're the only ones backing me up on this incredibly basic philosophy thing and I don't feel like I have anywhere else to go." This culminated in a group conversation with the entire house, which I found unsettling. (Unfortunately, I didn't take notes and don't remember the details except that I had a sense of everyone else seeming to agree on things that I thought were clearly contrary to the spirit of the Sequences.)

The two-year-old son of Mike and "Meredith" was reportedly saying the next day that Kelsey doesn't like his daddy, which was confusing until it was figured out he had heard Kelsey talking about why she doesn't like Michael Vassar.6

And as it happened, on 7 May 2019, Kelsey wrote a Facebook comment displaying evidence of understanding my thesis.

These two datapoints led me to a psychological hypothesis: when people see someone wavering between their coalition and a rival coalition, they're intuitively motivated to offer a few concessions to keep the wavering person on their side. Kelsey could afford to speak as if she didn't understand the thing about sex being a natural category when it was just me freaking out alone, but visibly got it almost as soon as I could credibly threaten to walk (defect to a coalition of people she dislikes). Maybe my "closing thoughts" email had a similar effect on Yudkowsky, assuming he otherwise wouldn't have spontaneously tweeted something about the univariate fallacy two weeks later? This probably wouldn't work if you repeated it, or tried to do it consciously?

Exit Wounds (May 2019)

I started drafting a "why I've been upset for five months and have lost faith in the so-called 'rationalist' community" memoir-post. Ben said that the target audience to aim for was sympathetic but naïve people like I had been a few years ago, who hadn't yet had the experiences I'd had. This way, they wouldn't have to freak out to the point of being imprisoned and demand help from community leaders and not get it; they could just learn from me.

I didn't know how to continue it. I was too psychologically constrained; I didn't know how to tell the Whole Dumb Story without escalating personal conflicts or leaking info from private conversations.

I decided to take a break from the religious civil war and from this blog. I declared May 2019 as Math and Wellness Month.

My dayjob performance had been suffering for months. The psychology of the workplace is ... subtle. There's a phenomenon where some people are vastly more productive than others and everyone knows it, but no one is cruel enough to make it common knowledge. This is awkward for people who simultaneously benefit from the culture of common-knowledge-prevention, which allows them to collect the status and money rents of being a $150K/year software engineer without actually performing at that level, and who also read enough Ayn Rand as a teenager to be ideologically opposed to subsisting on unjustly-acquired rents rather than value creation. I didn't think the company would fire me, but I was worried that they should.

I asked my boss to temporarily assign me some easier tasks that I could make steady progress on. (We had a lot of LaTeX templating of insurance policy amendments that needed to get done.) If I was going to be psychologically impaired, it was better to be up-front about how I could best serve the company given that impairment, rather than hoping the boss wouldn't notice.

My intended break from the religious war didn't take. I met with Anna on the UC Berkeley campus and read her excerpts from Ben's and Jessica's emails. (She had not provided a comment on "Where to Draw the Boundaries?" despite my requests, including in the form of two paper postcards that I stayed up until 2 a.m. on 14 April 2019 writing; spamming people with hysterical and somewhat demanding postcards felt more distinctive than spamming people with hysterical and somewhat demanding emails.)

I complained that I had believed our own marketing material about the "rationalists" remaking the world by wielding a hidden Bayesian structure of Science and Reason that applies outside the laboratory. Was that all a lie? Were we not trying to do the thing anymore? Anna was dismissive: she thought that the idea I had gotten about "the thing" was never actually part of the original vision. She kept repeating that she had tried to warn me, and I didn't listen. (Back in the late 'aughts, she had often recommended Paul Graham's essay "What You Can't Say" to people, summarizing Graham's moral that you should figure out the things you can't say in your culture and then not say them, in order to avoid getting drawn into pointless conflicts.)

It was true that she had tried to warn me for years, and (not yet having gotten over my teenage ideological fever dream), I hadn't known how to listen. But this seemed fundamentally unresponsive to how I kept repeating that I only expected consensus on the basic philosophy of language and categorization (not my object-level special interest in sex and gender). Why was it so unrealistic to imagine that the smart people could enforce standards in our own tiny little bubble?

My frustration bubbled out into follow-up emails:

I'm also still pretty angry about how your response to my "I believed our own propaganda" complaint is (my possibly-unfair paraphrase) "what you call 'propaganda' was all in your head; we were never actually going to do the unrestricted truthseeking thing when it was politically inconvenient." But ... no! I didn't just make up the propaganda! The hyperlinks still work! I didn't imagine them! They were real! You can still click on them: "A Sense That More Is Possible", "Raising the Sanity Waterline"

I added:

Can you please acknowledge that I didn't just make this up? Happy to pay you $200 for a reply to this email within the next 72 hours

Anna said she didn't want to receive cheerful price offers from me anymore; previously, she had regarded my occasionally throwing money at her to bid for her scarce attention7 as good-faith libertarianism between consenting adults, but now she was afraid that if she accepted, it would be portrayed in some future Ben Hoffman essay as an instance of her using me. She agreed that someone could have gotten the ideals I had gotten out of those posts, but there was also evidence from that time pointing the other way (e.g., "Politics Is the Mind-Killer") and it shouldn't be surprising if people steered clear of controversy.

I replied: but when forming the original let's-be-apolitical vision in 2008, we did not anticipate that whether I should cut my dick off would become a political issue. That was new evidence about whether the original vision was wise! I wasn't particularly trying to do politics with my idiosyncratic special interest; I was trying to think seriously about the most important thing in my life and only do the minimum amount of politics necessary to protect my ability to think. If 2019-era "rationalists" were going to commit an epistemology mistake that interfered with my ability to think seriously about the most important thing in my life, and they couldn't correct the mistake even after it was pointed out, then the "rationalists" were worse than useless to me. This probably didn't matter causally (I wasn't an AI researcher, therefore I didn't matter), but it might matter timelessly (if I were part of a reference class that included AI researchers).

Fundamentally, I was skeptical that you could do consistently high-grade reasoning as a group without committing heresy, because of the mechanism that Yudkowsky had described in "Entangled Truths, Contagious Lies" and "Dark Side Epistemology": the need to lie about lying and cover up cover-ups propagates recursively. Anna was unusually skillful at thinking things without saying them; I thought people facing similar speech restrictions generally just get worse at thinking (plausibly8 including Yudkowsky), and the problem gets worse as the group effort scales. (It's less risky to recommend "What You Can't Say" to your housemates than to put it on your 501(c)(3) organization's canonical reading list.) You can't optimize your group's culture for not talking about atheism without also optimizing against understanding Occam's razor; you can't optimize for not questioning gender self-identity without also optimizing against understanding the 37 ways that words can be wrong.

Squabbling On and With lesswrong.com (May–July 2019)

Despite Math and Wellness Month and my intent to take a break from the religious civil war, I kept reading Less Wrong during May 2019, and ended up scoring a couple of victories in the civil war (at some cost to Wellness).

MIRI researcher Scott Garrabrant wrote a post about how "Yes Requires the Possibility of No". Information-theoretically, a signal sent with probability one transmits no information: you can only learn something from hearing a "Yes" if you believed that the answer could have been "No". I saw an analogy to my philosophy-of-language thesis, and mentioned it in a comment: if you want to believe that x belongs to category C, you might try redefining C in order to make the question "Is x a C?" come out "Yes", but you can only do so at the expense of making C less useful. Meaningful category-membership (Yes) requires the possibility of non-membership (No).
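The information-theoretic claim can be checked with a few lines of arithmetic. What follows is only a minimal sketch: the function names `surprisal_bits` and `entropy_bits` are my own illustrative choices, not anything from Garrabrant's post. It uses the standard definition that observing an event of probability p conveys log2(1/p) bits.

```python
import math


def surprisal_bits(p: float) -> float:
    """Bits of information gained from observing an event of probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return math.log2(1.0 / p)


def entropy_bits(p_yes: float) -> float:
    """Expected bits conveyed by a Yes/No answer when P(Yes) = p_yes."""
    outcomes = (p_yes, 1.0 - p_yes)
    # Outcomes with probability zero never occur, so they contribute nothing.
    return sum(p * surprisal_bits(p) for p in outcomes if p > 0.0)


# A "Yes" that was certain in advance carries zero bits;
# a "Yes" that could just as easily have been "No" carries one bit.
print(surprisal_bits(1.0), surprisal_bits(0.5))
print(entropy_bits(1.0), entropy_bits(0.5))
```

The same arithmetic applies to the category version: if C is redefined so that "Is x a C?" comes out "Yes" for every x under consideration, then P(Yes) = 1 and the answer carries no information.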

Someone objected that she found it "unpleasant that [I] always bring [my] hobbyhorse in, but in an 'abstract' way that doesn't allow discussing the actual object level question"; it made her feel "attacked in a way that allow[ed] for no legal recourse to defend [herself]." I replied that that was understandable, but that I found it unpleasant that our standard Bayesian philosophy of language somehow got politicized, such that my attempts to do correct epistemology were perceived as attacking people. Such a trainwreck ensued that the mods manually moved the comments to their own post. Based on the karma scores and what was said,9 I count it as a victory.

On 31 May 2019, a draft of a new Less Wrong FAQ included a link to "The Categories Were Made for Man, Not Man for the Categories" as one of Scott Alexander's best essays. I argued that it would be better to cite almost literally any other Slate Star Codex post (most of which, I agreed, were exemplary). I claimed that the following disjunction was true: either Alexander's claim that "There's no rule of rationality saying that [one] shouldn't" "accept an unexpected [X] or two deep inside the conceptual boundaries of what would normally be considered [Y] if it'll save someone's life" was a blatant lie, or I could call it a blatant lie because no rule of rationality says I shouldn't draw the category boundaries of "blatant lie" that way. Ruby Bloom, the new moderator who wrote the draft, was persuaded, and "... Not Man for the Categories" was not included in the final FAQ. Another "victory."

But "victories" weren't particularly comforting when I resented this becoming a political slapfight at all. I wrote to Anna and Steven Kaas (another old-timer who I was trying to "recruit" to my side of the civil war). In "What You Can't Say", Paul Graham had written, "The problem is, there are so many things you can't say. If you said them all you'd have no time left for your real work." But surely that depends on what your real work is. For someone like Paul Graham, whose goal was to make a lot of money writing software, "Don't say it" (except in this one meta-level essay) was probably the right choice. But someone whose goal is to improve Society's collective ability to reason should probably be doing more fighting than Paul Graham (although still preferably on the meta- rather than object-level), because political restrictions on speech and thought directly hurt the mission of "improve our collective ability to reason" in a way that they don't hurt the mission of "make a lot of money writing software."

I said I didn't know if either of them had caught the "Yes Requires the Possibility" trainwreck, but wasn't it terrifying that the person who objected to my innocuous philosophy comment was a MIRI research associate? Not to demonize that commenter, because I was just as bad (if not worse) in 2008. The difference was that in 2008, we had a culture that could beat it out of me.

Steven objected that tractability and side effects matter, not just effect on the mission considered in isolation. For example, the Earth's gravitational field directly impedes NASA's mission, and doesn't hurt Paul Graham, but both NASA and Paul Graham should spend the same amount of effort (viz., zero) trying to reduce the Earth's gravity.

I agreed that tractability needed to be addressed, but the situation felt analogous to being in a coal mine in which my favorite of our canaries had just died. Caliphate officials (Eliezer, Scott, Anna) and loyalists (Steven) were patronizingly consoling me: sorry, I know you were really attached to that canary, but it's just a bird; it's not critical to the coal-mining mission. I agreed that I was unreasonably attached to that particular bird, but that's not why I expected them to care. The problem was what the dead canary was evidence of: if you're doing systematically correct reasoning, you should be able to get the right answer even when the question doesn't matter. (The causal graph is the fork "canary death ← mine gas → danger" rather than the direct link "canary death → danger".) Ben and Michael and Jessica claimed to have spotted their own dead canaries. I felt like the old-timer Rationality Elders should have been able to get on the same page about the canary-count issue?
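To make the fork structure concrete, here is a toy calculation. Every number in it is a made-up assumption chosen only for illustration, and the names (`P_GAS`, `p_danger_given_dead_canary`, and so on) are hypothetical. The sketch shows that under the fork "canary death ← mine gas → danger", seeing a dead canary raises the probability of danger, while killing the canary by hand (an intervention that cuts the gas → canary edge) changes nothing.

```python
# All probabilities below are invented for illustration only.
P_GAS = 0.05                                     # prior probability of gas in the mine
P_DEAD_GIVEN_GAS = {True: 0.90, False: 0.01}     # P(canary dead | gas present?)
P_DANGER_GIVEN_GAS = {True: 0.80, False: 0.02}   # P(danger | gas present?)


def p_gas(present: bool) -> float:
    return P_GAS if present else 1.0 - P_GAS


def p_danger() -> float:
    """Marginal probability of danger, before looking at the canary."""
    return sum(p_gas(g) * P_DANGER_GIVEN_GAS[g] for g in (True, False))


def p_danger_given_dead_canary() -> float:
    """P(danger | canary observed dead), by summing over the common cause."""
    joint = sum(
        p_gas(g) * P_DEAD_GIVEN_GAS[g] * P_DANGER_GIVEN_GAS[g]
        for g in (True, False)
    )
    evidence = sum(p_gas(g) * P_DEAD_GIVEN_GAS[g] for g in (True, False))
    return joint / evidence


def p_danger_do_dead_canary() -> float:
    """P(danger | do(canary dead)): the canary has no arrow into danger,
    so forcing its state leaves the probability of danger at its prior."""
    return p_danger()


print(p_danger())                    # roughly 0.06
print(p_danger_given_dead_canary())  # roughly 0.66
print(p_danger_do_dead_canary())     # same as the prior
```

With these assumed numbers the dead canary matters a great deal as evidence even though the bird itself is causally irrelevant to the mission, which is the sense in which "it's just a bird" misses the point.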

Math and Wellness Month ended up being mostly a failure: the only math I ended up learning was a fragment of group theory and some probability theory that later turned out to be deeply relevant to understanding sex differences. So much for taking a break.

In June 2019, I made a linkpost on Less Wrong to Tal Yarkoni's "No, It's Not The Incentives—It's you", about how professional scientists should stop using career incentives as an excuse for doing poor science. It generated a lot of discussion.

In an email (Subject: "LessWrong.com is dead to me"), Jessica identified Less Wrong moderator Raymond Arnold's comments as her last straw. Jessica wrote:

LessWrong.com is a place where, if the value of truth conflicts with the value of protecting elites' feelings and covering their asses, the second value will win.

Trying to get LessWrong.com to adopt high-integrity norms is going to fail, hard, without a lot of conflict. (Enforcing high-integrity norms is like violence; if it doesn't work, you're not doing enough of it). People who think being exposed as fraudulent (or having their friends exposed as fraudulent) is a terrible outcome, are going to actively resist high-integrity discussion norms.

Posting on Less Wrong made sense as harm-reduction, but the only way to get people to stick up for truth would be to convert them to a whole new worldview, which would require a lot of in-person discussions. She brought up the idea of starting a new forum to replace Less Wrong.

Ben said that trying to discuss with the Less Wrong mod team would be a good intermediate step, after we clarified to ourselves what was going on; it might be "good practice in the same way that the Eliezer initiative was good practice." The premise should be, "If this is within the Overton window for Less Wrong moderators, there's a serious confusion on the conditions required for discourse"—scapegoating individuals wasn't part of it. He was less optimistic about harm reduction; participating on the site was implicitly endorsing it by submitting to the rule of the karma and curation systems.

"Riley" expressed sadness about how the discussion on "The Incentives" demonstrated that the community they loved—including dear friends—was in a bad way. Michael (in a separate private discussion) had said he was glad to hear about the belief-update. "Riley" said that Michael saying that also made them sad, because it seemed discordant to be happy about sad news. Michael wrote:

I['m] sorry it made you sad. From my perspective, the question is no[t] "can we still be friends with such people", but "how can we still be friends with such people" and I am pretty certain that understanding their perspective [is] an important part of the answer. If clarity seems like death to them and like life to us, and we don't know this, IMHO that's an unpromising basis for friendship.


I got into a scuffle with Ruby Bloom on his post on "Causal Reality vs. Social Reality". I wrote what I thought was a substantive critique, but Ruby complained that my tone was too combative, and asked for more charity and collaborative truth-seeking10 in any future comments.

(My previous interaction with Ruby had been my challenge to "... Not Man for the Categories" appearing on the Less Wrong FAQ. Maybe he couldn't let me "win" again so quickly?)

I emailed the posse about the thread, on the grounds that gauging the psychology of the mod team was relevant to our upcoming Voice vs. Exit choices. Meanwhile on Less Wrong, Ruby kept doubling down:

[I]f the goal is everyone being less wrong, I think some means of communicating are going to be more effective than others. I, at least, am a social monkey. If I am bluntly told I am wrong (even if I agree, even in private—but especially in public), I will feel attacked (if only at the S1 level), threatened (socially), and become defensive. It makes it hard to update and it makes it easy to dislike the one who called me out. [...]

[...]

Even if you wish to express that someone is wrong, I think this is done more effectively if one simultaneously continues to implicitly express "I think there is still some prior that you are correct and I [am] curious to hear your thoughts", or failing that "You are very clearly wrong here yet I still respect you as a thinker who is worth my time to discourse with." [...] There's an icky thing here I feel like for there to be productive and healthy discussion you have to act as though at least one of the above statements is true, even if it isn't.

"Wow, he's really overtly arguing that people should lie to him to protect his feelings," Ben commented via email. I would later complain to Anna that Ruby's profile said he was one of two people to have volunteered for CfAR on three continents. If this was the level of performance we could expect from veteran CfAR participants, what was CfAR for?

I replied to Ruby that you could just directly respond to your interlocutor's arguments. Whether you respect them as a thinker is off-topic. "You said X, but this is wrong because of Y" isn't a personal attack! I thought it was ironic that this happened on a post that was explicitly about causal vs. social reality; it's possible that I wouldn't have been so rigid about this if it weren't for that prompt.

(On reviewing the present post prior to publication, Ruby writes that he regrets his behavior during this exchange.)

Jessica ended up writing a post, "Self-Consciousness Wants Everything to Be About Itself", arguing that tone arguments are mainly about people silencing discussion of actual problems in order to protect their feelings. She used as a central example a case study of a college official crying and saying that she "felt attacked" in response to complaints about her office being insufficiently supportive of a racial community.

Jessica was surprised by how well it worked, judging by Ruby mentioning silencing in a subsequent comment to me (plausibly influenced by Jessica's post) and by an exchange between Ray and Ruby that she thought was "surprisingly okay".

From this, Jessica derived the moral that when people are doing something that seems obviously terrible and in bad faith, it can help to publicly explain why the abstract thing is bad, without accusing anyone. This made sense because people didn't want to be held to standards that other people aren't being held to: a call-out directed at oneself personally could be selective enforcement, but a call-out of the abstract pattern invited changing one's behavior if the new equilibrium looked better.

Michael said that part of the reason this worked was that it represented a clear threat of scapegoating without actually scapegoating and without surrendering the option to do so later; it was significant that Jessica's choice of example positioned her on the side of the powerful social-justice coalition.


On 4 July 2019, Scott Alexander published "Some Clarifications on Rationalist Blogging", disclaiming any authority as a "rationalist" leader. ("I don't want to claim this blog is doing any kind of special 'rationality' work beyond showing people interesting problems [...] Insofar as [Slate Star Codex] makes any pretensions to being 'rationalist', it's a rationalist picnic and not a rationalist monastery.") I assumed this was inspired by Ben's request back in March that Scott "alter the beacon" so as to not confuse people about what the current-year community was. I appreciated it.


Jessica published "The AI Timelines Scam", arguing that the recent prominence of "short" (e.g., 2030) timelines to transformative AI was better explained by political factors than by technical arguments: just as in previous decades, people had incentives to bluff and exaggerate about the imminence of AGI in order to attract resources to their own project.

(Remember, this was 2019. After seeing what GPT-3, DALL-E, PaLM, &c. could do during the "long May 2020", it now looks to me that the short-timelines people had better intuitions than Jessica gave them credit for.)

I still sympathized with the pushback from Caliphate supporters against using "scam"/"fraud"/"lie"/&c. language to include motivated elephant-in-the-brain-like distortions. I conceded that this was a boring semantic argument, but I feared that until we invented better linguistic technology, the boring semantic argument was going to continue sucking up discussion bandwidth with others.

"Am I being too tone-policey here?" I asked the posse. "Is it better if I explicitly disclaim, 'This is marketing advice; I'm not claiming to be making a substantive argument'?" (Subject: "Re: reception of 'The AI Timelines Scam' is better than expected!")

Ben replied, "What exactly is a scam, if it's not misinforming people systematically about what you have to offer, in a direction that moves resources towards you?" He argued that investigations of financial fraud focus on false promises about money, rather than the psychological minutiæ of the perp's motives.

I replied that the concept of mens rea did seem necessary for maintaining good incentives, at least in some contexts. The law needs to distinguish between accidentally hitting a pedestrian in one's car ("manslaughter") and premeditated killing ("first-degree murder"), because traffic accidents are significantly less disincentivizable than offing one's enemies. (Anyone who drives at all is taking on some nonzero risk of committing vehicular manslaughter.) The manslaughter example was simpler than misinformation-that-moves-resources,11 and it might not be easy for the court to determine "intent", but I didn't see what would reverse the weak principle that intent sometimes matters.

Ben replied that what mattered in the determination of manslaughter vs. murder was whether there was long-horizon optimization power toward the outcome of someone's death, not what sentiments the killer rehearsed in their working memory.

On a phone call later, Michael made an analogy between EA and Catholicism. The Pope was fraudulent, because the legitimacy of the Pope's position (and his claims to power and resources) rested on the pretense that he had a direct relationship with God, which wasn't true, and the Pope had to know on some level that it wasn't true. (I agreed that this usage of "fraud" made sense to me.) In Michael's view, Ben's charges against GiveWell were similar: GiveWell's legitimacy rested on the pretense that they were making decisions based on numbers, and they had to know at some level that they weren't doing that.


Ruby wrote a document about ways in which one's speech could harm people, which was discussed in the comments of a draft Less Wrong post by some of our posse members and some of the Less Wrong mods.12

Ben wrote:

What I see as under threat is the ability to say in a way that's actually heard, not only that opinion X is false, but that the process generating opinion X is untrustworthy, and perhaps actively optimizing in an objectionable direction. Frequently, attempts to say this are construed primarily as moves to attack some person or institution, pushing them into the outgroup. Frequently, people suggest to me an "equivalent" wording with a softer tone, which in fact omits important substantive criticisms I mean to make, while claiming to understand what's at issue.

Ray Arnold replied:

My core claim is: "right now, this isn't possible, without a) it being heard by many people as an attack, b) without people having to worry that other people will see it as an attack, even if they don't."

It seems like you see this something as "there's a precious thing that might be destroyed" and I see it as "a precious thing does not exist and must be created, and the circumstances in which it can exist are fragile." It might have existed in the very early days of LessWrong. But the landscape now is very different than it was then. With billions of dollars available and at stake, what worked then can't be the same thing as what works now.

(!!)13

Jessica pointed this out as a step towards discussing the real problem (Subject: "progress towards discussing the real thing??"). She elaborated in the secret thread: now that the "EA" scene was adjacent to real-world money and power, people were incentivized to protect their reputations (and beliefs related to their reputations) in anti-epistemic ways, in a way that they wouldn't if the scene were still just a philosophy club. This was catalyzing a shift of norms from "that which can be destroyed by the truth, should be" towards protecting feelings—where "protecting feelings" was actually about protecting power. The fact that the scene was allocating billions of dollars made it more important for public discussions to reach the truth, compared to philosophy club—but it also increased the likelihood of obfuscatory behavior that philosophy-club norms (like "assume good faith") didn't account for. We might need to extend philosophy-club norms to take into account the possibility of adversarial action: there's a reason that courts of law don't assume good faith. We didn't want to disproportionately punish people for getting caught up in obfuscatory patterns; that would just increase the incentive to obfuscate. But we did need some way to reveal what was going on.

In email, Jessica acknowledged that Ray had a point that it was confusing to use court-inspired language if we didn't intend to blame and punish people. Michael said that court language was our way to communicate "You don't have the option of non-engagement with the complaints that are being made." (Courts can summon people; you can't ignore a court summons the way you can ignore ordinary critics.)

Michael said that we should also develop skill in using social-justicey blame language, as was used against us, harder, while we still thought of ourselves as trying to correct people's mistakes rather than being in a conflict against the Blight. "Riley" said that this was a terrifying you-have-become-the-abyss suggestion; Ben thought it was obviously a good idea.

I was horrified by the extent to which Less Wrong moderators (!) seemed to be explicitly defending "protect feelings" norms. Previously, I had mostly been seeing the present struggle through the lens of my idiosyncratic Something to Protect as a simple matter of Bay Area political correctness. I was happy to have Michael, Ben, and Jessica as allies, but I hadn't been seeing the Blight as a unified problem. Now I was seeing something.

An in-person meeting was arranged for 23 July 2019 at the Less Wrong office, with Ben, Jessica, me, and most of the Less Wrong team (Ray, Ruby, Oliver Habryka, Vaniver, Jim Babcock). I don't have notes and don't really remember what was discussed in enough detail to faithfully recount it.14 I ended up crying at one point and left the room for a while.

The next day, I asked Ben and Jessica for their takeaways via email (Subject: "peace talks outcome?"). Jessica said that I was a "helpful emotionally expressive and articulate victim" and that there seemed to be a consensus that people like me should be warned somehow that Less Wrong wasn't doing fully general sanity-maximization anymore. (Because community leaders were willing to sacrifice, for example, ability to discuss non-AI heresies in order to focus on sanity about AI in particular while maintaining enough mainstream acceptability and power.)

I said that for me and my selfish perspective, the main outcome was finally shattering my "rationalist" social identity. I needed to exhaust all possible avenues of appeal before it became real to me. The morning after was the first for which "rationalists ... them" felt more natural than "rationalists ... us".

A Beleaguered Ally Under Fire (July–August 2019)

Michael's reputation in the community, already not what it once was, continued to be debased even further.

The local community center, the Berkeley REACH,15 was conducting an investigation as to whether to exclude Michael (which was mostly moot, as he didn't live in the Bay Area). When I heard that the committee conducting the investigation was "very close to releasing a statement", I wrote to them:

I've been collaborating with Michael a lot recently, and I'm happy to contribute whatever information I can to make the report more accurate. What are the charges?

They replied:

To be clear, we are not a court of law addressing specific "charges." We're a subcommittee of the Berkeley REACH Panel tasked with making decisions that help keep the space and the community safe.

I replied:

Allow me to rephrase my question about charges. What are the reasons that the safety of the space and the community require you to write a report about Michael? To be clear, a community that excludes Michael on inadequate evidence is one where I feel unsafe.

We arranged a call, during which I angrily testified that Michael was no threat to the safety of the space and the community. This would have been a bad idea if it were the cops, but in this context, I figured my political advocacy couldn't hurt.

Concurrently, I got into an argument with Kelsey Piper about Michael after she wrote on Discord that her "impression of Vassar's threatening schism is that it's fundamentally about Vassar threatening to stir shit up until people stop socially excluding him for his bad behavior." I didn't think that was what the schism was about (Subject: "Michael Vassar and the theory of optimal gossip").

In the course of litigating Michael's motivations (the details of which are not interesting enough to summarize here), Kelsey mentioned that she thought Michael had done immense harm to me—that my models of the world and ability to reason were worse than they were a year ago. I thanked her for the concern, and asked if she could be more specific.

She said she was referring to my ability to predict consensus and what other people believe. I expected people to be convinced by arguments that they found not only unconvincing, but so unconvincing they didn't see why I would bother. I believed things to be in obvious violation of widespread agreement that everyone else thought were not. My shocked indignation at other people's behavior indicated a poor model of social reality.

I considered this an insightful observation about a way in which I'm socially retarded. I had had similar problems with school. We're told that the purpose of school is education (to the extent that most people think of school and education as synonyms), but the consensus behavior is "sit in lectures and trade assignments for grades." Faced with what I saw as a contradiction between the consensus narrative and the consensus behavior, I would assume that the narrative was the "correct" version, and so I spent a lot of time trying to start conversations about math with everyone and then getting indignant when they'd say, "What class is this for?" Math isn't for classes; it's the other way around, right?

Empirically, no! But I had to resolve the contradiction between narrative and reality somehow, and if my choices were "People are mistakenly failing to live up to the narrative" and "Everybody knows the narrative is a lie; it would be crazy to expect people to live up to it", the former had been more appealing.

It was the same thing here. Kelsey said that it was predictable that Yudkowsky wouldn't make a public statement, even one as basic as "category boundaries should be drawn for epistemic and not instrumental reasons," because his experience of public statements was that they'd be taken out of context and used against MIRI by the likes of /r/SneerClub. This wasn't an update at all. (Everyone at "Arcadia" had agreed, in the house discussion in April.) Vassar's insistence that Eliezer be expected to do something that he obviously was never going to do had caused me to be confused and surprised by reality.16

Kelsey seemed to be taking it as obvious that Eliezer Yudkowsky's public behavior was optimized to respond to the possibility of political attacks from people who hate him anyway, and not the actuality of thousands of words of careful arguments appealing to his own writings from ten years ago. Very well. Maybe it was obvious. But if so, I had no reason to care what Eliezer Yudkowsky said, because not provoking SneerClub isn't truth-tracking, and careful arguments are. This was a huge surprise to me, even if Kelsey knew better.

What Kelsey saw as "Zack is losing his ability to model other people and I'm worried about him," I thought Ben and Jessica would see as "Zack is angry about living in simulacrum level 3 and we're worried about everyone else."

I did think that Kelsey was mistaken about how much causality to attribute to Michael's influence, rather than to me already being socially retarded. From my perspective, validation from Michael was merely the catalyst that excited me from confused-and-sad to confused-and-socially-aggressive-about-it. The latter phase revealed a lot of information, and not just to me. Now I was ready to be less confused—after I was done grieving.

Later, talking in person at "Arcadia", Kelsey told me that the REACH was delaying its release of its report about Michael because someone whose identity she could not disclose had threatened to sue. As far as my interest in defending Michael went, I counted this as short-term good news (because the report wasn't being published for now) but longer-term bad news (because the report must be a hit piece if Michael's mysterious ally was trying to hush it).

When I mentioned this to Michael on Signal on 3 August 2019, he replied:

The person is me, the whole process is a hit piece, literally, the investigation process and not the content. Happy to share the latter with you. You can talk with Ben about appropriate ethical standards.

In retrospect, I feel dumb for not guessing that Michael's mysterious ally was Michael himself. This kind of situation is an example of how norms protecting confidentiality distort information; Kelsey felt obligated to obfuscate any names connected to potential litigation, which led me to infer the existence of a nonexistent person. I can't say I never introduce this kind of distortion myself (for I, too, am bound by norms), but when I do, I feel dirty about it.

As far as appropriate ethical standards go, I didn't approve of silencing critics with lawsuit threats, even while I agreed with Michael that "the process is the punishment." I imagine that if the REACH wanted to publish a report about me, I would expect to defend myself in public, having faith that the beautiful weapon of my Speech would carry the day against a corrupt community center—or for that matter, against /r/SneerClub.

This is arguably one of my more religious traits. Michael and Kelsey are domain experts and probably know better.

A Poignant-to-Me Anecdote That Fits Here Chronologically But Doesn't Particularly Foreshadow Anything (August 2019)

While visiting "Arcadia", "Meredith" and Mike's son (age 2¾ years) asked me, "Why are you a boy?"

After a long pause, I said, "Yes," as if I had misheard the question as "Are you a boy?" I think it was a motivated mishearing: it was only after I answered that I consciously realized that's not what the kid asked.

I think I would have preferred to say, "Because I have a penis, like you." But it didn't seem appropriate.

Philosophy Blogging Interlude! (August–October 2019)

I wanted to finish the memoir-post mourning the "rationalists", but I still felt psychologically constrained. So instead, I mostly turned to a combination of writing bitter and insulting comments whenever I saw someone praise the "rationalists" collectively, and—more philosophy blogging!

In August 2019's "Schelling Categories, and Simple Membership Tests", I explained a nuance that had only merited a passing mention in "Where to Draw the Boundaries?": sometimes you might want categories for different agents to coordinate on, even at the cost of some statistical "fit." (This was generalized from a "pro-trans" argument that had occurred to me, that self-identity is an easy Schelling point when different people disagree about what "gender" they perceive someone as.)

In September 2019's "Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists", I presented a toy mathematical model of how censorship distorts group beliefs. I was surprised by how well-received it was (high karma, Curated within a few days, later included in the Best-of-2019 collection), especially given that it was explicitly about politics (albeit at a meta level, of course). Ben and Jessica had discouraged me from bothering when I sent them a draft. (Jessica said that it was obvious even to ten-year-olds that partisan politics distorts impressions by filtering evidence. "[D]o you think we could get a ten-year-old to explain it to Eliezer Yudkowsky?" I asked.)

In October 2019's "Algorithms of Deception!", I exhibited some toy Python code modeling different kinds of deception. If a function faithfully passes its observations as input to another function, the second function can construct a well-calibrated probability distribution. But if the first function outright fabricates evidence, or selectively omits some evidence, or gerrymanders the categories by which it interprets its observations as evidence, the second function computes a worse probability distribution.
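To make the idea concrete, here is a minimal sketch in the same spirit. It is not the code from the post: the coin-flip setup, the two hypotheses, and every function name are assumptions made up for illustration, and the listener is assumed to take whatever it is told at face value. It covers fabrication and selective omission but leaves out category gerrymandering.

```python
from fractions import Fraction

# Hypothetical setup (not from the original post): a coin is either "fair"
# (P(heads) = 1/2) or "biased" (P(heads) = 3/4), with equal prior odds.
P_HEADS = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}


def posterior_biased(reported):
    """Listener: Bayes-update on a list of 'H'/'T' reports, taken at face value."""
    weights = {}
    for hypothesis, p in P_HEADS.items():
        likelihood = Fraction(1)
        for flip in reported:
            likelihood *= p if flip == "H" else 1 - p
        weights[hypothesis] = Fraction(1, 2) * likelihood  # prior * likelihood
    return weights["biased"] / sum(weights.values())


def honest(observations):
    """Pass every observation along unchanged."""
    return list(observations)


def fabricate(observations):
    """Outright fabrication: report heads no matter what was seen."""
    return ["H" for _ in observations]


def omit(observations):
    """Selective omission: report only the flips that came up heads."""
    return [flip for flip in observations if flip == "H"]


# Illustrative data: 4 heads and 4 tails, which actually favors "fair".
observations = list("HTHTHTHT")

for reporter in (honest, fabricate, omit):
    belief = posterior_biased(reporter(observations))
    print(f"{reporter.__name__:>9}: P(biased) = {float(belief):.3f}")
```

With these made-up numbers, the honestly informed listener ends up at about 0.24 for "biased", while the listener fed fabricated or filtered reports ends up above 0.8, even though the omitting reporter never says anything false.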

Also in October 2019, in "Maybe Lying Doesn't Exist", I replied to Scott Alexander's "Against Lie Inflation", which was itself a generalized rebuke of Jessica's "The AI Timelines Scam". Scott thought Jessica was wrong to use language like "lie", "scam", &c. to describe someone being (purportedly) motivatedly wrong, but not necessarily consciously lying.

I was furious when "Against Lie Inflation" came out. (Furious at what I perceived as hypocrisy, not because I particularly cared about defending Jessica's usage.) Oh, so now Scott agreed that making language less useful is a problem?! But on further consideration, I realized he was actually being consistent in admitting appeals to consequences as legitimate. In objecting to the expanded definition of "lying", Alexander was counting "everyone is angrier" (because of more frequent accusations of lying) as a cost. In my philosophy, that wasn't a legitimate cost. (If everyone is lying, maybe people should be angry!)

The Caliph's Madness (August and November 2019)

I continued to note signs of contemporary Yudkowsky not being the same author who wrote the Sequences. In August 2019, he Tweeted:

I am actively hostile to neoreaction and the alt-right, routinely block such people from commenting on my Twitter feed, and make it clear that I do not welcome support from those quarters. Anyone insinuating otherwise is uninformed, or deceptive.

I argued that the people who smear him as a right-wing Bad Guy do so in order to extract these kinds of statements of political alignment as concessions; his own timeless decision theory would seem to recommend ignoring them rather than paying even this small Danegeld.

When I emailed the posse about it begging for Likes (Subject: "can't leave well enough alone"), Jessica said she didn't get my point. If people are falsely accusing you of something (in this case, of being a right-wing Bad Guy), isn't it helpful to point out that the accusation is false? It seemed like I was advocating for self-censorship on the grounds that speaking up helps the false accusers. But it also helps bystanders (by correcting the misapprehension) and hurts the false accusers (by demonstrating to bystanders that the accusers are making things up). By linking to "Kolmogorov Complicity and the Parable of Lightning" in my replies, I seemed to be insinuating that Yudkowsky was under some sort of duress, but this wasn't spelled out: if Yudkowsky would face social punishment for advancing right-wing opinions, did that mean he was under such duress that saying anything at all would be helping the oppressors?

The paragraph from "Kolmogorov Complicity" that I was thinking of was (bolding mine):

Some other beliefs will be found to correlate heavily with lightning-heresy. Maybe atheists are more often lightning-heretics; maybe believers in global warming are too. The enemies of these groups will have a new cudgel to beat them with, "If you believers in global warming are so smart and scientific, how come so many of you believe in lightning, huh?" Even the savvy Kolmogorovs within the global warming community will be forced to admit that their theory just seems to attract uniquely crappy people. It won't be very convincing. Any position correlated with being truth-seeking and intelligent will be always on the retreat, having to forever apologize that so many members of their movement screw up the lightning question so badly.

I perceived a pattern where people who are in trouble with the orthodoxy buy their own safety by denouncing other heretics: not just disagreeing with the other heretics because they are mistaken, which would be right and proper Discourse, but denouncing them ("actively hostile to") as a way of paying Danegeld.

Suppose there are five true heresies, but anyone who's on the record as believing more than one gets burned as a witch. Then it's impossible to have a unified rationalist community, because people who want to talk about one heresy can't let themselves be seen in the company of people who believe another. That's why Scott Alexander couldn't get the philosophy of categorization right in full generality, even though his writings revealed an implicit understanding of the correct way,17 and he and I had a common enemy in the social-justice egregore. He couldn't afford to. He'd already spent his Overton budget on anti-feminism.

Alexander (and Yudkowsky and Anna and the rest of the Caliphate) seemed to accept this as an inevitable background fact of existence, like the weather. But I saw a Schelling point off in the distance where us witches stick together for Free Speech,18 and it was tempting to try to jump there. (It would probably be better if there were a way to organize just the good witches, and exclude all the Actually Bad witches, but the Sorites problem on witch Badness made that hard to organize without falling back to the one-heresy-per-thinker equilibrium.)

Jessica thought my use of "heresy" was conflating factual beliefs with political movements. (There are no intrinsically "right wing" facts.) I agreed that conflating political positions with facts would be bad. I wasn't interested in defending the "alt-right" (whatever that means) broadly. But I had learned stuff from reading far-right authors (most notably Mencius Moldbug) and from talking with "Thomas". I was starting to appreciate what Michael had said about "Less precise is more violent" back in April when I was talking about criticizing "rationalists".

Jessica asked if my opinion would change depending on whether Yudkowsky thought neoreaction was intellectually worth engaging with. (Yudkowsky had said years ago that Moldbug was low quality.)

I would never fault anyone for saying "I vehemently disagree with what little I've read and/or heard of this author." I wasn't accusing Yudkowsky of being insincere.

What I did think was that the need to keep up appearances of not being a right-wing Bad Guy was a serious distortion of people's beliefs, because there are at least a few questions of fact where believing the correct answer can, in the political environment of the current year, be used to paint one as a right-wing Bad Guy. I would have hoped for Yudkowsky to notice that this is a rationality problem and to not actively make the problem worse. I was counting "I do not welcome support from those quarters" as making the problem worse insofar as it would seem to imply that if I thought I'd learned valuable things from Moldbug, that made me less welcome in Yudkowsky's fiefdom.

Yudkowsky certainly wouldn't endorse "Even learning things from these people makes you unwelcome" as stated, but "I do not welcome support from those quarters" still seemed like a pointlessly partisan silencing/shunning attempt, when one could just as easily say, "I'm not a neoreactionary, and if some people who read me are, that's obviously not my fault."

Jessica asked if Yudkowsky denouncing neoreaction and the alt-right would still seem harmful, if he were also to acknowledge, e.g., racial IQ differences?

I agreed that that would be better, but realistically, I didn't see why Yudkowsky should want to poke that hornet's nest. This was the tragedy of recursive silencing: if you can't afford to engage with heterodox ideas, either you become an evidence-filtering clever arguer, or you're not allowed to talk about anything except math. (Not even the relationship between math and human natural language, as we had found out recently.)

It was as if there was a "Say Everything" attractor and a "Say Nothing" attractor, and my incentives were pushing me towards the "Say Everything" attractor—but that was only because I had Something to Protect in the forbidden zone and I was a decent programmer (who could therefore expect to be employable somewhere, just as James Damore eventually found another job). Anyone in less extreme circumstances would find themselves pushed toward the "Say Nothing" attractor.

It was instructive to compare Yudkowsky's new disavowal of neoreaction with one from 2013, in response to a TechCrunch article citing former MIRI employee Michael Anissimov's neoreactionary blog More Right:19

"More Right" is not any kind of acknowledged offspring of Less Wrong nor is it so much as linked to by the Less Wrong site. We are not part of a neoreactionary conspiracy. We are and have been explicitly pro-Enlightenment, as such, under that name. Should it be the case that any neoreactionary is citing me as a supporter of their ideas, I was never asked and never gave my consent. [...]

Also to be clear: I try not to dismiss ideas out of hand due to fear of public unpopularity. However I found Scott Alexander's takedown of neoreaction convincing and thus I shrugged and didn't bother to investigate further.

My criticism regarding negotiating with terrorists did not apply to the 2013 disavowal. More Right was brand encroachment on Anissimov's part that Yudkowsky had a legitimate interest in policing, and the "I try not to dismiss ideas out of hand" disclaimer importantly avoided legitimizing McCarthyist persecution.

The question was, what had specifically happened in the last six years to shift Yudkowsky's opinion on neoreaction from (paraphrased) "Scott says it's wrong, so I stopped reading" to (verbatim) "actively hostile"? Note especially the inversion from (both paraphrased) "I don't support neoreaction" (fine, of course) to "I don't even want them supporting me" (which was bizarre; humans with very different views on politics nevertheless have a common interest in not being transformed into paperclips).

Did Yudkowsky get new information about neoreaction's hidden Badness parameter sometime between 2013 and 2019, or did moral coercion from the left intensify (because Trump and because Berkeley)? My bet was on the latter.


However it happened, it didn't seem like the brain damage was limited to "political" topics, either. In November 2019, we saw another example of Yudkowsky destroying language for the sake of politeness, this time in the context of him trying to wirehead his fiction subreddit by suppressing criticism-in-general.

That's my characterization, of course: the post itself talks about "reducing negativity". In a followup comment, Yudkowsky wrote (bolding mine):

On discussion threads for a work's particular chapter, people may debate the well-executedness of some particular feature of that work's particular chapter. Comments saying that nobody should enjoy this whole work are still verboten. Replies here should still follow the etiquette of saying "Mileage varied: I thought character X seemed stupid to me" rather than saying "No, character X was actually quite stupid."

But ... "I thought X seemed Y to me"20 and "X is Y" do not mean the same thing! The map is not the territory. The quotation is not the referent. The planning algorithm that maximizes the probability of doing a thing is different from the algorithm that maximizes the probability of having "tried" to do the thing. If my character is actually quite stupid, I want to believe that my character is actually quite stupid.

It might seem like a little thing of no significance—requiring "I" statements is commonplace in therapy groups and corporate sensitivity training—but this little thing coming from Eliezer Yudkowsky setting guidelines for an explicitly "rationalist" space made a pattern click. If everyone is forced to only make claims about their map ("I think", "I feel") and not make claims about the territory (which could be construed to call other people's maps into question and thereby threaten them, because disagreement is disrespect), that's great for reducing social conflict but not for the kind of collective information processing that accomplishes cognitive work,21 like good literary criticism. A rationalist space needs to be able to talk about the territory.

To be fair, the same comment I quoted also lists "Being able to consider and optimize literary qualities" as one of the major considerations to be balanced. But I think (I think) it's also fair to note that (as we had seen on Less Wrong earlier that year), lip service is cheap. It's easy to say, "Of course I don't think politeness is more important than truth," while systematically behaving as if you did.

"Broadcast criticism is adversely selected for critic errors," Yudkowsky wrote in the post on reducing negativity, correctly pointing out that if a work's true level of mistakenness is M, the i-th commenter's estimate of mistakenness has an error term of Ei, and commenters leave a negative comment when their estimate M + Ei is greater than their threshold for commenting Ti, then the comments that get posted will have been selected for erroneous criticism (high Ei) and commenter chattiness (low Ti).

I can imagine some young person who liked Harry Potter and the Methods being intimidated by the math notation and indiscriminately accepting this wisdom from the great Eliezer Yudkowsky as a reason to be less critical, specifically. But a somewhat less young person who isn't intimidated by math should notice that this is just regression to the mean. The same argument applies to praise!
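The selection effect is easy to check with a small simulation. This is only a sketch under assumptions of mine, not anything from Yudkowsky's post: each commenter's error term (`error`) is drawn from a standard normal, each threshold (`threshold`) from an independent normal, and the function name is hypothetical. Nothing in the code knows whether the quantity being judged is mistakenness or merit, which is the point about praise.

```python
import random
from statistics import mean


def mean_error_of_posters(true_level, n_commenters=20_000, seed=0):
    """Return (mean error over all readers, mean error over those who post).

    true_level is M, the work's true level of whatever is being judged.
    Each reader estimates M + error and posts only if that exceeds
    their personal threshold.
    """
    rng = random.Random(seed)
    all_errors, posted_errors = [], []
    for _ in range(n_commenters):
        error = rng.gauss(0.0, 1.0)      # assumed: unbiased, standard normal
        threshold = rng.gauss(1.0, 0.5)  # assumed: independent of the error
        all_errors.append(error)
        if true_level + error > threshold:
            posted_errors.append(error)
    return mean(all_errors), mean(posted_errors)


# M = 0 can be read as "no real mistakes" or, equally, as "no special merit".
overall, posted = mean_error_of_posters(true_level=0.0)
print(f"mean error, all readers: {overall:+.3f}")
print(f"mean error, posters:     {posted:+.3f}")
```

Readers as a whole are unbiased in this toy model, but the ones who clear their threshold and post overestimate M on average. Relabel M as "merit" and the posted comments as praise, and the same code shows posted praise to be selected for overestimates too.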

What I would hope for from a rationality teacher and a rationality community would be efforts to instill the general skill of modeling things like regression to the mean and selection effects, as part of the general project of having a discourse that does collective information-processing.

And from the way Yudkowsky writes these days, it looks like he's ... not interested in collective information-processing? Or that he doesn't actually believe that's a real thing? "Credibly helpful unsolicited criticism should be delivered in private," he writes! I agree that the positive purpose of public criticism isn't solely to help the author. (If it were, there would be no reason for anyone but the author to read it.) But readers do benefit from insightful critical commentary. (If they didn't, why would they read the comments section?) When I read a story and am interested in reading the comments about a story, it's because I'm interested in the thoughts of other readers, who might have picked up subtleties I missed. I don't want other people to self-censor comments on any plot holes or Fridge Logic they noticed for fear of dampening someone else's enjoyment or hurting the author's feelings.

Yudkowsky claims that criticism should be given in private because then the target "may find it much more credible that you meant only to help them, and weren't trying to gain status by pushing them down in public." I'll buy this as a reason why credibly altruistic unsolicited criticism should be delivered in private.22 Indeed, meaning only to help the target just doesn't seem like a plausible critic motivation in most cases. But the fact that critics typically have non-altruistic motives doesn't mean criticism isn't helpful. In order to incentivize good criticism, you want people to be rewarded with status for making good criticisms. You'd have to be some sort of communist to disagree with this!23

There's a striking contrast between the Yudkowsky of 2019 who wrote the "Reducing Negativity" post, and an earlier Yudkowsky (from even before the Sequences) who maintained a page on Crocker's rules: if you declare that you operate under Crocker's rules, you're consenting to other people optimizing their speech for conveying information rather than being nice to you. If someone calls you an idiot, that's not an "insult"; they're just informing you about the fact that you're an idiot, and you should probably thank them for the tip. (If you were an idiot, wouldn't you be better off knowing that?)

It's of course important to stress that Crocker's rules are opt-in on the part of the receiver; it's not a license to unilaterally be rude to other people. Adopting Crocker's rules as a community-level norm on an open web forum does not seem like it would end well.

Still, there's something precious about a culture where people appreciate the obvious normative ideal underlying Crocker's rules, even if social animals can't reliably live up to the normative ideal. Speech is for conveying information. People can say things—even things about me or my work—not as a command, or as a reward or punishment, but just to establish a correspondence between words and the world: a map that reflects a territory.

Appreciation of this obvious normative ideal seems strikingly absent from Yudkowsky's modern work—as if he's given up on the idea that reasoning in public is useful or possible. His Less Wrong commenting guidelines declare, "If it looks like it would be unhedonic to spend time interacting with you, I will ban you from commenting on my posts." The idea that people who are unhedonic to interact with might have intellectually substantive criticisms that the author has a duty to address does not seem to have crossed his mind.

The "Reducing Negativity" post also warns against the failure mode of attempted "author telepathy": attributing bad motives to authors and treating those attributions as fact without accounting for uncertainty or distinguishing observations from inferences. I should be explicit, then: when I say negative things about Yudkowsky's state of mind, like it's "as if he's given up on the idea that reasoning in public is useful or possible", that's a probabilistic inference, not a certain observation.

But I think making probabilistic inferences is ... fine? The sentence "Credibly helpful unsolicited criticism should be delivered in private" sure does look to me like text generated by a state of mind that doesn't believe that reasoning in public is useful or possible. I think that someone who did believe in public reason would have noticed that criticism has information content whose public benefits might outweigh its potential to harm an author's reputation or feelings. If you think I'm getting this inference wrong, feel free to let me and other readers know why in the comments.

A Worthy Critic At Last (November 2019)

I received an interesting email comment on my philosophy-of-categorization thesis from MIRI researcher Abram Demski. Abram asked: ideally, shouldn't all conceptual boundaries be drawn with appeal-to-consequences? Wasn't the problem just with bad (motivated, shortsighted) appeals to consequences? Agents categorize in order to make decisions. The best classifier for an application depends on the costs and benefits. As a classic example, prey animals need to avoid predators, so it makes sense for their predator-detection classifiers to be configured such that they jump away from every rustling in the bushes, even if it's usually not a predator.

I had thought of the "false positives are better than false negatives when detecting predators" example as being about the limitations of evolution as an AI designer: messy evolved animal brains don't track probability and utility separately the way a cleanly-designed AI could. As I had explained in "... Boundaries?", it made sense for consequences to motivate what variables you paid attention to. But given the subspace that's relevant to your interests, you want to run an "epistemically legitimate" clustering algorithm on the data you see there, which depends on the data, not your values. Ideal probabilistic beliefs shouldn't depend on consequences.
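One way to picture the view just described is a decision rule that keeps the belief and the payoffs in separate variables. The sketch below is illustrative only: the function names and the specific costs are assumptions, and it uses the standard expected-cost comparison rather than anything stated in the email exchange.

```python
def flee_threshold(cost_false_alarm, cost_missed_predator):
    """Probability above which fleeing has lower expected cost than staying.

    Fleeing costs cost_false_alarm when there is no predator.
    Staying costs cost_missed_predator when there is one.
    Flee when p * cost_missed_predator > (1 - p) * cost_false_alarm.
    """
    return cost_false_alarm / (cost_false_alarm + cost_missed_predator)


def should_flee(p_predator, cost_false_alarm, cost_missed_predator):
    """The belief p_predator is an input; the costs only move the threshold."""
    return p_predator > flee_threshold(cost_false_alarm, cost_missed_predator)


# Hypothetical numbers: a rustle is a predator only 5% of the time.
p = 0.05
jumpy = should_flee(p, cost_false_alarm=1.0, cost_missed_predator=100.0)
calm = should_flee(p, cost_false_alarm=1.0, cost_missed_predator=1.0)
print(jumpy, calm)
```

In this sketch the animal that jumps at every rustle and the one that stays put hold the same 5% belief. Only the threshold differs, which is the sense in which consequences can shape the decision without touching the probability estimate.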

Abram didn't think the issue was so clear-cut. Where do "probabilities" come from, in the first place? The reason we expect something like Bayesianism to be an attractor among self-improving agents is because probabilistic reasoning is broadly useful: epistemology can be derived from instrumental concerns. He agreed that severe wireheading issues potentially arise if you allow consequentialist concerns to affect your epistemics.

But the alternative view had its own problems. If your AI consists of a consequentialist module that optimizes for utility in the world, and an epistemic module that optimizes for the accuracy of its beliefs, that's two agents, not one: how could that be reflectively coherent? You could, perhaps, bite the bullet here, for fear that consequentialism doesn't propagate itself and that wireheading was inevitable. On this view, Abram explained, "Agency is an illusion which can only be maintained by crippling agents and giving them a split-brain architecture where an instrumental task-monkey does all the important stuff while an epistemic overseer supervises." Whether this view was ultimately tenable or not, this did show that trying to forbid appeals-to-consequences entirely led to strange places.

I didn't immediately have an answer for Abram, but I was grateful for the engagement. (Abram was clearly addressing the real philosophical issues, and not just trying to mess with me in the way that almost everyone else in Berkeley was trying to mess with me.)

Writer's Block (November 2019)

I wrote to Ben about how I was still stuck on writing the grief-memoir. My plan had been to tell the story of the Category War while Glomarizing about the content of private conversations, then offer Scott and Eliezer pre-publication right of reply (because it's only fair to give your former-hero-current-frenemies warning when you're about to publicly call them intellectually dishonest), then share it to Less Wrong and the /r/TheMotte culture war thread, and then I would have the emotional closure to move on with my life (learn math, go to gym, chop wood, carry water).

The reason it should have been safe to write was because it's good to explain things. It should be possible to say, "This is not a social attack; I'm not saying 'rationalists Bad, Yudkowsky Bad'; I'm just trying to tell the true story about why I've been upset this year, including addressing counterarguments for why some would argue that I shouldn't be upset, why other people could be said to be behaving 'reasonably' given their incentives, why I nevertheless wish they'd be braver and adhere to principle rather than 'reasonably' following incentives, &c."

So why couldn't I write? Was it that I didn't know how to make "This is not a social attack" credible? Maybe because ... it wasn't true?? I was afraid that telling a story about our leader being intellectually dishonest was the nuclear option. If you're slowly but surely gaining territory in a conventional war, suddenly escalating to nukes would be pointlessly destructive. This metaphor was horribly non-normative (arguing is not a punishment; carefully telling a true story about an argument is not a nuke), but I didn't know how to make it stably go away.

A more motivationally-stable compromise would be to split off whatever generalizable insights that would have been part of the story into their own posts. "Heads I Win, Tails?—Never Heard of Her" had been a huge success as far as I was concerned, and I could do more of that kind of thing, analyzing the social stuff without making it personal, even if, secretly ("secretly"), it was personal.

Ben replied that it didn't seem like it was clear to me that I was a victim of systemic abuse, and that I was trying to figure out whether I was being fair to my abusers. He thought if I could internalize that, I would be able to forgive myself a lot of messiness, which would make the problem less daunting.

I said I would bite that bullet: Yes, I was trying to figure out whether I was being fair to my abusers, and it was an important question to get right! "Other people's lack of standards harmed me, therefore I don't need to hold myself to standards in my response because I have extenuating circumstances" would be a lame excuse.

This seemed correlated with the recurring stalemated disagreement within our posse, where Michael/Ben/Jessica would say, "Fraud, if the word ever meant anything", and while I agreed that they were pointing to an important pattern of false representations optimized to move resources, I was still sympathetic to the Caliphate-defender's perspective that this usage of "fraud" was motte-and-baileying between different senses of the word. (Most people would say that the things we were alleging MIRI and CfAR had done wrong were qualitatively different from the things Enron and Bernie Madoff had done wrong.24) I wanted to do more work to formulate a more precise theory of the psychology of deception to describe exactly how things were messed up in a way that wouldn't be susceptible to the motte-and-bailey charge.

Interactions With a Different Rationalist Splinter Group (November–December 2019)

On 12 and 13 November 2019, Ziz published several blog posts laying out her grievances against MIRI and CfAR. On the fifteenth, Ziz and three collaborators staged a protest at the CfAR reunion being held at a retreat center in the North Bay near Camp Meeker. A call to the police falsely alleged that the protesters had a gun, resulting in a dramatic police reaction (SWAT team called, highway closure, children's group a mile away being evacuated—the works).

I was tempted to email links to Ziz's blog posts to the Santa Rosa Press-Democrat reporter covering the incident (as part of my information-sharing-is-good virtue ethics), but decided to refrain because I predicted that Anna would prefer I didn't.

The main relevance of this incident to my Whole Dumb Story is that Ziz's memoir–manifesto posts included a 5500 word section about me. Ziz portrays me as a slave to social reality, throwing trans women under the bus to appease the forces of cissexism. I don't think that's what's going on with me, but I can see why the theory was appealing.


On 12 December 2019 I had an interesting exchange with Somni, one of the "Meeker Four"—presumably out on bail at this time?—on Discord.

I told her it was surprising that she spent so much time complaining about CfAR, Anna Salamon, Kelsey Piper, &c., but I seemed to get along fine with her—because naïvely, one would think that my views were so much worse. Was I getting a pity pass because she thought false consciousness was causing me to act against my own transfem class interests? Or what?

In order to be absolutely clear about my terrible views, I said that I was privately modeling a lot of transmisogyny complaints as something like—a certain neurotype-cluster of non-dominant male is latching onto locally ascendant social-justice ideology in which claims to victimhood can be leveraged into claims to power. Traditionally, men are moral agents, but not patients; women are moral patients, but not agents. If weird non-dominant men aren't respected if identified as such (because low-ranking males aren't valuable allies, and don't have the intrinsic moral patiency of women), but can get victimhood/moral-patiency points for identifying as oppressed transfems, that creates an incentive gradient for them to do so. No one was allowed to notice this except me, because everybody who's anybody prefers to stay on the good side of social-justice ideology unless they have Something to Protect that requires defying it.

Somni said we got along because I was being victimized by the same forces of gaslighting as her and wasn't lying about my agenda. Maybe she should be complaining about me?—but I seemed to be following a somewhat earnest epistemic process, whereas Kelsey, Scott, and Anna were not. If I were to start going, "Here's my rationality org; rule #1: no transfems (except me); rule #2, no telling people about rule #1", then she would talk about it.

I would later remark to Anna that Somni and Ziz saw themselves as being oppressed by people's hypocritical and manipulative social perceptions and behavior. Merely using the appropriate language ("Somni ... she", &c.) protected her against threats from the Political Correctness police, but it actually didn't protect against threats from the Zizians. The mere fact that I wasn't optimizing for PR (lying about my agenda, as Somni said) was what made me not a direct enemy (although still a collaborator) in their eyes.

Philosophy Blogging Interlude 2! (December 2019)

I had a pretty productive blogging spree in December 2019. In addition to a number of more minor posts on this blog and on Less Wrong, I also got out some more significant posts bearing on my agenda.

On this blog, in "Reply to Ozymandias on Fully Consensual Gender", I finally got out at least a partial reply to Ozy Brennan's June 2018 reply to "The Categories Were Made for Man to Make Predictions", affirming the relevance of an analogy Ozy had made between the socially-constructed natures of money and social gender, while denying that the analogy supported gender by self-identification. (I had been working on a more exhaustive reply, but hadn't managed to finish whittling it into a shape that I was totally happy with.)

I also polished and pulled the trigger on "On the Argumentative Form 'Super-Proton Things Tend to Come In Varieties'", my reply to Yudkowsky's implicit political concession to me back in March. I had been reluctant to post it based on an intuition of, "My childhood hero was trying to do me a favor; it would be a betrayal to reject the gift." The post itself explained why that intuition was crazy, but that just brought up more anxieties about whether the explanation constituted leaking information from private conversations—but I had chosen my words carefully such that it wasn't. ("Even if Yudkowsky doesn't know you exist [...] he's effectively doing your cause a favor" was something I could have plausibly written in the possible world where the antecedent was true.) Jessica said the post seemed good.

On Less Wrong, the mods had just announced a new end-of-year Review event, in which the best posts from the year before would be reviewed and voted on, to see which had stood the test of time and deserved to be part of our canon of cumulative knowledge. (That is, this Review period starting in late 2019 would cover posts published in 2018.)

This provided me with an affordance to write some posts critiquing posts that had been nominated for the Best-of-2018 collection that I didn't think deserved such glory. In response to "Decoupling vs. Contextualizing Norms" (which had been cited in a way that I thought obfuscatory during the "Yes Implies the Possibility of No" trainwreck), I wrote "Relevance Norms; Or, Gricean Implicature Queers the Decoupling/Contextualizing Binary", appealing to our academically standard theory of how context affects meaning to explain why "decoupling vs. contextualizing norms" is a false dichotomy.

More significantly, in reaction to Yudkowsky's "Meta-Honesty: Firming Up Honesty Around Its Edge Cases", I published "Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think",25 explaining why I thought "Meta-Honesty" was relying on an unproductively narrow sense of "honesty", because the ambiguity of natural language makes it easy to deceive people without technically lying.

I thought that one cut to the heart of the shocking behavior that we had seen from Yudkowsky lately. The "hill of meaning in defense of validity" affair had been driven by Yudkowsky's obsession with not technically lying, on two levels: he had proclaimed that asking for new pronouns "Is. Not. Lying." (as if that were the matter that anyone cared about—as if conservatives and gender-critical feminists should just pack up and go home after it had been demonstrated that trans people aren't lying), and he had seen no interest in clarifying his position on the philosophy of language, because he wasn't lying when he said that preferred pronouns weren't lies (as if that were the matter my posse cared about—as if I should keep honoring him as my caliph after it had been demonstrated that he hadn't lied). But his Sequences had articulated a higher standard than merely not-lying. If he didn't remember, I could at least hope to remind everyone else.

I also wrote a little post, "Free Speech and Triskaidekaphobic Calculators", arguing that it should be easier to have a rationality/alignment community that just does systematically correct reasoning than a politically savvy community that does systematically correct reasoning except when that would taint AI safety with political drama, analogous to how it's easier to build a calculator that just does correct arithmetic, than a calculator that does correct arithmetic except that it never displays the result 13. In order to build a "triskaidekaphobic calculator", you would need to "solve arithmetic" anyway, and the resulting product would be limited not only in its ability to correctly compute 6 + 7 but also the infinite family of calculations that include 13 as an intermediate result: if you can't count on (6 + 7) + 1 being the same as 6 + (7 + 1), you lose the associativity of addition.
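The calculator analogy can be sketched in a few lines. This is a toy illustration, not code from the post: `phobic_add`, `Refusal`, and `try_eval` are hypothetical names, and modeling the refusal as an exception is an assumption about how such a device might behave.

```python
class Refusal(Exception):
    """Raised when the calculator declines to produce a result."""


def phobic_add(a, b):
    # The device still has to do correct arithmetic internally,
    # because that is the only way to know which answers to suppress.
    total = a + b
    if total == 13:
        raise Refusal("result withheld")
    return total


def try_eval(thunk):
    """Run a calculation, returning None if any step was refused."""
    try:
        return thunk()
    except Refusal:
        return None


# (6 + 7) + 1 passes through 13 as an intermediate result and fails.
left = try_eval(lambda: phobic_add(phobic_add(6, 7), 1))

# 6 + (7 + 1) never touches 13 and succeeds.
right = try_eval(lambda: phobic_add(6, phobic_add(7, 1)))

print(left, right)
```

The two groupings denote the same sum in ordinary arithmetic, but only one of them is computable on this device, so a user can no longer regroup additions freely.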

A Newtonmas Party (December 2019)

On 20 December 2019, Scott Alexander messaged me on Discord—that I shouldn't answer if it would be unpleasant, but that he was thinking of asking about autogynephilia on the next Slate Star Codex survey, and wanted to know if I had any suggestions about question design, or if I could suggest any "intelligent and friendly opponents" to consult. After reassuring him that he shouldn't worry about answering being unpleasant ("I am actively at war with the socio-psychological forces that make people erroneously think that talking is painful!"), I referred him to my friend Tailcalled, who had a lot of experience conducting surveys and ran a "Hobbyist Sexologists" Discord server, which seemed likely to have some friendly opponents.

The next day (I assume while I still happened to be on his mind), Scott also commented on "Maybe Lying Doesn't Exist", my post from back in October replying to his "Against Lie Inflation".

I was frustrated with his reply, which I felt was not taking into account points that I had already covered in detail. A few days later, on the twenty-fourth, I succumbed to the temptation to blow up at him in the comments.

After commenting, I noticed what day it was and added a few more messages to our Discord chat—

okay, maybe speech is sometimes painful
the Less Wrong comment I just left you is really mean
and you know it's not because I don't like you
you know it's because I'm genuinely at my wit's end
after I posted it, I was like, "Wait, if I'm going to be this mean to Scott, maybe Christmas Eve isn't the best time?"
it's like the elephant in my brain is gambling that by being socially aggressive, it can force you to actually process information about philosophy which you otherwise would not have an incentive to
I hope you have a merry Christmas

And then, as an afterthought—

oh, I guess we're Jewish
that attenuates the "is a hugely inappropriately socially-aggressive blog comment going to ruin someone's Christmas" fear somewhat

Scott messaged back at 11:08 the next morning, Christmas Day. He explained that the thought process behind his comment was that he still wasn't sure where we disagreed and didn't know how to proceed except to dump his understanding of the philosophy (which would include things I already knew) and hope that I could point to the step I didn't like. He didn't know how to convince me of his sincerity and rebut my accusations of him motivatedly playing dumb (which he was inclined to attribute to the malign influence of Michael Vassar's gang).

I explained that the reason for those accusations was that I knew he knew about strategic equivocation, because he taught everyone else about it (as in his famous posts about the motte-and-bailey doctrine and the noncentral fallacy). And so when he acted like he didn't get it when I pointed out that this also applied to "trans women are women", that just seemed implausible.

He asked for a specific example. ("Trans women are women, therefore trans women have uteruses" being a bad example, because no one was claiming that.) I quoted an article from The Nation: "There is another argument against allowing trans athletes to compete with cis-gender athletes that suggests that their presence hurts cis-women and cis-girls. But this line of thought doesn't acknowledge that trans women are in fact women." Scott agreed that this was stupid and wrong and a natural consequence of letting people use language the way he was suggesting (!).

I didn't think it was fair to ordinary people to expect them to go as deep into the philosophy-of-language weeds as I could before being allowed to object to this kind of chicanery. I thought "pragmatic" reasons to not just use the natural clustering that you would get by impartially running a clustering algorithm on the subspace of configuration space relevant to your goals, basically amounted to "wireheading" (optimizing someone's map for looking good rather than reflecting the territory) or "war" (optimizing someone's map to not reflect the territory in order to manipulate them). If I were to transition today and didn't pass as well as Jessica, and everyone felt obligated to call me a woman, they would be wireheading me: making me think my transition was successful, even though it wasn't. That's not a nice thing to do to a rationalist.

Scott thought that trans people had some weird thing going on in their brains such that being referred to as their natal sex was intrinsically painful, like an electric shock. The thing wasn't an agent, so the injunction to refuse to give in to extortion didn't apply. Having to use a word other than the one you would normally use in order to avoid subjecting someone to painful electric shocks was worth it.

I thought I knew things about the etiology of transness such that I didn't think the electric shock was inevitable, but I didn't want the conversation to go there if it didn't have to. I didn't have to ragequit the so-called "rationalist" community over a complicated empirical question, only over bad philosophy. Scott said he might agree with me if he thought the tradeoff were unfavorable between clarity and utilitarian benefit—or if he thought it had the chance of snowballing like in his "Kolmogorov Complicity and the Parable of Lightning".

I pointed out that what sex people are is more relevant to human social life than whether lightning comes before thunder. He said that the problem in his parable was that people were being made ignorant of things, whereas in the transgender case, no one was being kept ignorant; their thoughts were just following a longer path.

I was skeptical of the claim that no one was "really" being kept ignorant. If you're sufficiently clever and careful and you remember how language worked when Airstrip One was still Britain, then you can still think, internally, and express yourself as best you can in Newspeak. But a culture in which Newspeak is mandatory, and all of Oceania's best philosophers have clever arguments for why Newspeak doesn't distort people's beliefs, doesn't seem like a culture that could solve AI alignment.

I linked to Zvi Mowshowitz's post about how the claim that "everybody knows" something gets used to silence people trying to point out the thing: in this case, basically, "'Everybody knows' our kind of trans women are sampled from (part of) the male multivariate trait distribution rather than the female multivariate trait distribution, why are you being a jerk and pointing this out?" But I didn't think that everyone knew.26 I thought the people who sort-of knew were being intimidated into doublethinking around it.

At this point, it was almost 2 p.m. (the paragraphs above summarizing a larger volume of typing), and Scott mentioned that he wanted to go to the Event Horizon Christmas party, and asked if I wanted to come and continue the discussion there. I assented, and thanked him for his time; it would be really exciting if we could avoid a rationalist civil war.

When I arrived at the party, people were doing a reading of the "Hero Licensing" dialogue epilogue to Inadequate Equilibria, with Yudkowsky himself playing the Mysterious Stranger. At some point, Scott and I retreated upstairs to continue our discussion. By the end of it, I was feeling more assured of Scott's sincerity, if not his competence. Scott said he would edit in a disclaimer note at the end of "... Not Man for the Categories".

It would have been interesting if I also got the chance to talk to Yudkowsky for a few minutes, but if I did, I wouldn't be allowed to recount any details of that here due to the privacy rules I'm following.

The rest of the party was nice. People were reading funny GPT-2 quotes from their phones. At one point, conversation happened to zag in a way that let me show off the probability fact I had learned during Math and Wellness Month. A MIRI researcher sympathetically told me that it would be sad if I had to leave the Bay Area, which I thought was nice. There was nothing about the immediate conversational context to suggest that I might have to leave the Bay, but I guess by this point, my existence had become a context.

All in all, I was feeling less ragequitty about the rationalists27 after the party—as if by credibly threatening to ragequit, the elephant in my brain had managed to extort more bandwidth from our leadership. The note Scott added to the end of "... Not Man for the Categories" still betrayed some philosophical confusion, but I now felt hopeful about addressing that in a future blog post explaining my thesis that unnatural category boundaries were for "wireheading" or "war".

It was around this time that someone told me that I wasn't adequately taking into account that Yudkowsky was "playing on a different chessboard" than me. (A public figure focused on reducing existential risk from artificial general intelligence is going to sense different trade-offs around Kolmogorov complicity strategies than an ordinary programmer or mere worm focused on things that don't matter.) No doubt. But at the same time, I thought Yudkowsky wasn't adequately taking into account the extent to which some of his longtime supporters (like Michael or Jessica) were, or had been, counting on him to uphold certain standards of discourse (rather than chess)?

Another effect of my feeling better after the party was that my motivation to keep working on my memoir of the Category War vanished—as if I was still putting weight on a zero-sum frame in which the memoir was a nuke that I only wanted to use as an absolute last resort.

Ben wrote (Subject: "Re: state of Church leadership"):

It seems to [me] that according to Zack's own account, even writing the memoir privately feels like an act of war that he'd rather avoid, not just using his own territory as he sees fit to create internal clarity around a thing.

I think this has to mean either
(a) that Zack isn't on the side of clarity except pragmatically where that helps him get his particular story around gender and rationalism validated
or
(b) that Zack has ceded the territory of the interior of his own mind to the forces of anticlarity, not for reasons, but just because he's let the anticlaritarians dominate his frame.

Or, I pointed out, (c) I had ceded the territory of the interior of my own mind to Eliezer Yudkowsky in particular, and while I had made a lot of progress unwinding this, I was still, still not done, and seeing him at the Newtonmas party set me back a bit.

"Riley" reassured me that finishing the memoir privately would be clarifying and cathartic for me. If people in the Caliphate came to their senses, I could either not publish it, or give it a happy ending where everyone comes to their senses.

(It does not have a happy ending where everyone comes to their senses.)

Further Discourses on What the Categories Were Made For (January–February 2020)

Michael told me he had changed his mind about gender and the philosophy of language. We talked about it on the phone. He said that the philosophy articulated in "A Human's Guide to Words" was inadequate for politicized environments where our choice of ontology is constrained. If we didn't know how to coin a new third gender, or teach everyone the language of "clusters in high-dimensional configuration space," our actual choices for how to think about trans women were basically three: creepy men (the TERF narrative), crazy men (the medical model), or a protected class of actual woman.28

According to Michael, while "trans women are real women" was a lie (in the sense that he agreed that me and Jessica and Ziz were not part of the natural cluster of biological females), it was also the case that "trans women are not real women" was a lie (in the sense that the "creepy men" and "crazy men" stories were wrong). "Trans women are women" could be true in the sense that truth is about processes that create true maps, such that we can choose the concepts that allow discourse and information flow. If the "creepy men" and "crazy men" stories are a cause of silencing, then—under present conditions—we had to choose the "protected class" story in order for people like Ziz to not be silenced.

My response (more vehemently when thinking on it a few hours later) was that this was a garbage bullshit appeal to consequences. If I wasn't going to let Ray Arnold get away with "we are better at seeking truth when people feel safe," I shouldn't let Michael get away with "we are better at seeking truth when people aren't oppressed." Maybe the wider world was ontology-constrained to those three choices, but I was aspiring to higher nuance in my writing.

"Thanks for being principled," he replied.


On 10 February 2020, Scott Alexander published "Autogenderphilia Is Common and Not Especially Related to Transgender", an analysis of the results of the autogynephilia/autoandrophilia questions on the recent Slate Star Codex survey. Based on eyeballing the survey data, Alexander proposed "if you identify as a gender, and you're attracted to that gender, it's a natural leap to be attracted to yourself being that gender" as a "very boring" theory.

I appreciated the endeavor of getting real data, but I was unimpressed with Alexander's analysis for reasons that I found difficult to write up in a timely manner; I've only just recently gotten around to polishing my draft and throwing it up as a standalone post. Briefly, I can see how it looks like a natural leap if you're verbally reasoning about "gender", but on my worldview, a hypothesis that puts "gay people (cis and trans)" in the antecedent is not boring and takes on a big complexity penalty, because that group is heterogeneous with respect to the underlying mechanisms of sexuality. I already don't have much use for "if you are a sex, and you're attracted to that sex" as a category of analytical interest, because I think gay men and lesbians are different things that need to be studied separately. Given that, "if you identify as a gender, and you're attracted to that gender" (with respect to "gender", not sex) comes off even worse: it's grouping together lesbians, and gay men, and heterosexual males with a female gender identity, and heterosexual females with a male gender identity. What causal mechanism could that correspond to?

(I do like the hypernym autogenderphilia.)

A Private Document About a Disturbing Hypothesis (early 2020)

There's another extremely important part of the story that would fit around here chronologically, but I again find myself constrained by privacy norms: everyone's common sense of decency (this time, even including my own) screams that it's not my story to tell.

Adherence to norms is fundamentally fraught for the same reason AI alignment is. In rich domains, attempts to regulate behavior with explicit constraints face a lot of adversarial pressure from optimizers bumping up against the constraint and finding the nearest unblocked strategies that circumvent it. The intent of privacy norms is to conceal information. But information in Shannon's sense is about what states of the world can be inferred given the states of communication signals; it's much more expansive than the denotative meaning of a text.

If norms can only regulate the denotative meaning of a text (because trying to regulate subtext is too subjective for a norm-enforcing coalition to coordinate on), someone who would prefer to reveal private information but also wants to comply with privacy norms has an incentive to leak everything they possibly can as subtext—to imply it, and hope to escape punishment on grounds of not having "really said it." And if there's some sufficiently egregious letter-complying-but-spirit-violating evasion of the norm that a coalition can coordinate on enforcing, the whistleblower has an incentive to stay only just shy of being that egregious.

Thus, it's unclear how much mere adherence to norms helps, when people's wills are actually misaligned. If I'm furious at Yudkowsky for prevaricating about my Something to Protect, and am in fact more furious rather than less that he managed to do it without violating the norm against lying, I should not be so foolish as to think myself innocent and beyond reproach for not having "really said it."

Having considered all this, I want to tell you about how I spent a number of hours from early May 2020 to early July 2020 working on a private Document about a disturbing hypothesis that had occurred to me earlier that year.

Previously, I had already thought it was nuts that trans ideology was exerting influence on the rearing of gender-non-conforming children—that is, children who are far outside the typical norm of behavior for their sex: very tomboyish girls and very effeminate boys.

Under recent historical conditions in the West, these kids were mostly "pre-gay" rather than trans. (The stereotype about lesbians being masculine and gay men being feminine is, like most stereotypes, basically true: the difference in sex-atypical childhood behavior between gay and straight adults has been meta-analyzed at Cohen's d ≈ 1.31 standard deviations for men and d ≈ 0.96 for women.) A solid majority of children diagnosed with gender dysphoria ended up growing out of it by puberty. In the culture of the current year, it seemed likely that a lot of those kids would instead get affirmed into a cross-sex identity at a young age, even though most of them would have otherwise (under a "watchful waiting" protocol) grown up to be ordinary gay men and lesbians.

What made this shift in norms crazy, in my view, was not just that transitioning younger children is a dubious treatment decision, but that it's a dubious treatment decision that was being made on the basis of the obvious falsehood that "trans" was one thing: the cultural phenomenon of "trans kids" was being used to legitimize trans adults, even though a supermajority of trans adults were in the late-onset taxon and therefore had never resembled these HSTS-taxon kids. That is: pre-gay kids in our Society are being sterilized in order to affirm the narcissistic delusions29 of guys like me.

That much was obvious to anyone who's had their Blanchardian enlightenment, and wouldn't have been worth the effort of writing a special private Document about. The disturbing hypothesis that occurred to me in early 2020 was that, in the culture of the current year, affirmation of a cross-sex identity might happen to kids who weren't HSTS-taxon at all.

Very small children who are just learning what words mean say a lot of things that aren't true (I'm a grown-up; I'm a cat; I'm a dragon), and grownups tend to play along in the moment as a fantasy game, but they don't coordinate to make that the permanent new social reality.

But if the grown-ups have been trained to believe that "trans kids know who they are"—if they're emotionally eager at the prospect of having a transgender child, or fearful of the damage they might do by not affirming—they might selectively attend to confirming evidence that the child "is trans", selectively ignore contrary evidence that the child "is cis", and end up reinforcing a cross-sex identity that would not have existed if not for their belief in it—a belief that the same people raising the same child ten years ago wouldn't have held. (A September 2013 article in The Atlantic by the father of a male child with stereotypically feminine interests was titled "My Son Wears Dresses; Get Over It", not "My Daughter Is Trans; Get Over It".)

Crucially, if gender identity isn't an innate feature of toddler psychology, the child has no way to know anything is "wrong." If none of the grown-ups can say, "You're a boy because boys are the ones with penises" (because that's not what nice smart liberal people are supposed to believe in the current year), how is the child supposed to figure that out independently? Toddlers are not very sexually dimorphic, but sex differences in play style and social behavior tend to emerge within a few years. There were no cars in the environment of evolutionary adaptedness, and yet the effect size of the sex difference in preference for toy vehicles is a massive d ≈ 2.44, about one and a half times the size of the sex difference in adult height.

(I'm going with the MtF case without too much loss of generality; I don't think the egregore is quite as eager to transition females at this age, but the dynamics are probably similar.)

What happens when the kid develops a self-identity as a girl, only to find out, potentially years later, that she noticeably doesn't fit in with the (cis) girls on the many occasions that no one has explicitly spelled out in advance where people are using "gender" (perceived sex) to make a prediction or decision?

Some might protest, "But what's the harm? She can always change her mind later if she decides she's actually a boy." I don't doubt that if the child were to clearly and distinctly insist, "I'm definitely a boy," the nice smart liberal grown-ups would unhesitatingly accept that.

But the harm I'm theorizing is not that the child has an intrinsic male identity that requires recognition. (What is an "identity", apart from the ordinary factual belief that one is of a particular sex?) Rather, the concern is that social transition prompts everyone, including the child themself, to use their mental models of girls (juvenile female humans) to make (mostly subconscious rather than deliberative) predictions and decisions about the child, which will be a systematically worse statistical fit than their models of boys (juvenile male humans), because the child is, in fact, a boy (juvenile male human), and those miscalibrated predictions and decisions will make the child's life worse in a complicated, illegible way that doesn't necessarily result in the child spontaneously asserting, "I prefer that you call me a boy" against the current of everyone in the child's life having accepted otherwise for as long as the kid can remember.

Scott Alexander has written about how concept-shaped holes can be impossible to notice. In a culture whose civic religion celebrates being trans and denies that gender has truth conditions other than the individual's say-so, there are concept-shaped holes that would make it hard for a kid to notice the hypothesis "I'm having a systematically worse childhood than I otherwise would have because all the grown-ups in my life have agreed I was a girl since I was three years old, even though all of my actual traits are sampled from the joint distribution for juvenile male humans, not juvenile female humans."

The epistemic difficulties extend to the grown-ups as well. I think people who are familiar with the relevant scientific literature or come from an older generation will find the story I've laid out above pretty compelling, but the parents are likely to be unmoved. They know they didn't coach the child to claim to be a girl. On what grounds could a stranger who wasn't there (or a skeptical family friend who sees the kid maybe once a month) assert that subconscious influence must be at work?

In the early twentieth century, a German schoolteacher named Wilhelm von Osten claimed to have taught his horse, Clever Hans, to do arithmetic and other intellectual feats. One could ask, "How much is 2/5 plus 1/2?" and the stallion would first stomp his hoof nine times, and then ten times—representing 9/10ths, the correct answer. An investigation concluded that no deliberate trickery was involved: Hans could often give the correct answer when questioned by a stranger, demonstrating that von Osten couldn't be secretly signaling the horse when to stop stomping. But further careful experiments by Oskar Pfungst revealed that Hans was picking up on unconscious cues "leaked" by the questioner's body language as the number of stomps approached the correct answer: for instance, Hans couldn't answer questions that the questioner themself didn't know.30

Notably, von Osten didn't accept Pfungst's explanation, continuing to believe that his intensive tutoring had succeeded in teaching the horse arithmetic.

It's hard to blame him, really. He had spent more time with Hans than anyone else. Hans observably could stomp out the correct answers to questions. Absent an irrational prejudice against the idea that a horse could learn arithmetic, why should he trust Pfungst's nitpicky experiments over the plain facts of his own intimately lived experience?

But what was in question wasn't the observations of Hans's performance, only the interpretation of what those observations implied about Hans's psychology. As Pfungst put it: "that was looked for in the animal which should have been sought in the man."

Similarly, in the case of a reputedly transgender three-year-old, a skeptical family friend isn't questioning observations of what the child said, only the interpretation of what those observations imply about the child's psychology. From the family's perspective, the evidence is clear: the child claimed to be a girl on many occasions over a period of months, and expressed sadness about being a boy. Absent an irrational prejudice against the idea that a child could be transgender, what could make them doubt the obvious interpretation of their own intimately lived experience?

From the skeptical family friend's perspective, there are a number of anomalies that cast serious doubt on what the family thinks is the obvious interpretation.

(Or so I'm imagining how this might go, hypothetically. The following illustrative vignettes may not reflect real events.)

For one thing, there may be clues that the child's information environment did not provide instruction on some of the relevant facts. Suppose that, six months before the child's social transition went down, another family friend had explained to the child that "Some people don't have penises." (Nice smart liberal grown-ups in the current year don't feel the need to be more specific.) Growing up in such a culture, the child's initial gender statements may reflect mere confusion rather than a deep-set need—and later statements may reflect social reinforcement of earlier confusion. Suppose that after social transition, the same friend explained to the child, "When you were little, you couldn't talk, so your parents had to guess whether you were a boy or a girl based on your parts." While this claim does convey the lesson that there's a customary default relationship between gender and genitals (in case that hadn't been clear before), it also reinforces the idea that the child is transgender.

For another thing, from the skeptical family friend's perspective, it's striking how the family and other grown-ups in the child's life seem to treat the child's statements about gender starkly differently than the child's statements about everything else.

Imagine that, around the time of the social transition, the child responded to "Hey kiddo, I love you" with, "I'm a girl and I'm a vegetarian." In the skeptic's view, both halves of that sentence were probably generated by the same cognitive algorithm—something like, "practice language and be cute to caregivers, making use of themes from the local cultural environment" (of nice smart liberal grown-ups who talk a lot about gender and animal welfare). In the skeptic's view, if you're not going to change the kid's diet on the basis of the second part, you shouldn't socially transition the kid on the basis of the first part.

Perhaps even more striking is the way that the grown-ups seem to interpret the child's conflicting or ambiguous statements about gender. Imagine that, around the time social transition was being considered, a parent asked the child whether the child would prefer to be addressed as "my son" or "my daughter."

Suppose the child replied, "My son. Or you can call me she. Everyone should call me she or her or my son."

The grown-ups seem to mostly interpret exchanges like this as indicating that while the child is trans, she's confused about the gender of the words "son" and "daughter". They don't seem to pay much attention to the competing hypothesis that the child knows he's his parents' "son", but is confused about the implications of she/her pronouns.

It's not hard to imagine how differential treatment by grown-ups of gender-related utterances could unintentionally shape outcomes. This may be clearer if we imagine a non-gender case. Suppose the child's father's name is John Smith, and that after a grown-up explains "Sr."/"Jr." generational suffixes after it happened to come up in fiction, the child declares that his name is John Smith, Jr. now. Caregivers are likely to treat this as just a cute thing that the kid said, quickly forgotten by all. But if caregivers feared causing psychological harm by denying a declared name change, one could imagine them taking the child's statement as a prompt to ask followup questions. ("Oh, would you like me to call you John or John Jr., or just Junior?") With enough followup, it seems plausible that a name change to "John Jr." would meet with the child's assent and "stick" socially. The initial suggestion would have come from the child, but most of the optimization—the selection that this particular statement should be taken literally and reinforced as a social identity, while others are just treated as a cute but not overly meaningful thing the kid said—would have come from the adults.

Finally, there is the matter of the child's behavior and personality. Suppose that, around the same time that the child's social transition was going down, a parent reported the child being captivated by seeing a forklift at Costco. A few months later, another family friend remarked that maybe the child is very competitive, and that "she likes fighting so much because it's the main thing she knows of that you can win."

I think people who are familiar with the relevant scientific literature or come from an older generation would look at observations like these and say, Well, yes, he's a boy; boys like vehicles (d ≈ 2.44!) and boys like fighting. Some of them might suggest that these observations should be counterindicators for transition—that the cross-gender verbal self-reports are less decision-relevant than the fact of a male child behaving in male-typical ways. But nice smart liberal grown-ups in the current year don't think that way.

One might imagine that the inferential distance between nice smart liberal grown-ups and people from an older generation (or a skeptical family friend) might be crossed by talking about it, but it turns out that talking doesn't help much when people have radically different priors and interpret the same evidence differently.

Imagine a skeptical family friend wondering (about four months after the social transition) what "being a girl" means to the child. How did the kid know?

A parent obliges by asking the child: "Hey kiddo, somebody wants to know how you know that you are a girl."

"Why?"

"He's interested in that kind of thing."

"I know that I'm a girl because girls like specific things like rainbows and I like rainbows so I'm a girl."

"Is that how you knew in the first place?"

"Yeah."

"You know there are a lot of boys who like rainbows."

"I don't think boys like rainbows so well—oh hey! Here this ball is!"

(When recounting this conversation, the parent adds that rainbows hadn't come up before, and that the child was looking at a rainbow-patterned item at the time of answering.)

It would seem that the interpretation of this kind of evidence depends on one's prior convictions. If you think that transition is a radical intervention that might pass a cost–benefit analysis for treating rare cases of intractable sex dysphoria, answers like "because girls like specific things like rainbows" are disqualifying. (A fourteen-year-old who could read an informed-consent form would be able to give a more compelling explanation than that, but a three-year-old just isn't ready to make this kind of decision.) Whereas if you think that some children have a gender that doesn't match their assigned sex at birth, you might expect them to express that affinity at age three, without yet having the cognitive or verbal abilities to explain it. Teasing apart where these two views make different predictions seems like it should be possible, but might be beside the point, if the real crux is over what categories are made for. (Is sex an objective fact that sometimes merits social recognition, or is it better to live in a Society where people are free to choose the gender that suits them?)

Anyway, that's just a hypothesis that occurred to me in early 2020, about something that could happen in the culture of the current year, hypothetically, as far as I know. I'm not a parent and I'm not an expert on child development. And even if the "Clever Hans" etiological pathway I conjectured is real, the extent to which it might apply to any particular case is complex; you could imagine a kid who was "actually trans" whose social transition merely happened earlier than it otherwise would have due to these dynamics.

For some reason, it seemed important that I draft a Document about it with lots of citations to send to a few friends. I thought about cleaning it up and publishing it as a public blog post (working title: "Trans Kids on the Margin; and, Harms from Misleading Training Data"), but for some reason, that didn't seem as pressing.

I put an epigraph at the top:

If you love someone, tell them the truth.

—Anonymous

Given that I spent so many hours on this little research and writing project in May–July 2020, I think it makes sense for me to mention it at this point in my memoir, where it fits in chronologically. I have an inalienable right to talk about my own research interests, and talking about my own research interests obviously doesn't violate any norm against leaking private information about someone else's family, or criticizing someone else's parenting decisions.

The New York Times Pounces (June 2020)

On 1 June 2020, I received a Twitter DM from New York Times reporter Cade Metz, who said he was "exploring a story about the intersection of the rationality community and Silicon Valley." I sent him an email saying that I would be happy to talk but that I had been pretty disappointed with the community lately: I was worried that the social pressures of trying to be a "community" and protect the group's status (e.g., from New York Times reporters who might portray us in an unflattering light?) might incentivize people to compromise on the ideals of systematically correct reasoning that made the community valuable in the first place.

He never got back to me. Three weeks later, all existing Slate Star Codex posts were taken down. A lone post on the main page explained that the New York Times piece was going to reveal Alexander's real last name and he was taking his posts down as a defensive measure. (No blog, no story?) I wrote a script (slate_starchive.py) to replace the Slate Star Codex links on this blog with links to the most recent Internet Archive copy.

Philosophy Blogging Interlude 3! (mid-2020)

I continued my philosophy of language work, looking into the academic literature on formal models of communication and deception. I wrote a couple posts encapsulating what I learned from that—and I continued work on my "advanced" philosophy of categorization thesis, the sequel to "Where to Draw the Boundaries?"

The disclaimer note that Scott Alexander had appended to "... Not Man for the Categories" after our Christmas 2019 discussion had said:

I had hoped that the Israel/Palestine example above made it clear that you have to deal with the consequences of your definitions, which can include confusion, muddling communication, and leaving openings for deceptive rhetorical strategies.

This is certainly an improvement over the original text without the note, but I took the use of the national borders metaphor to mean that Scott still hadn't gotten my point about there being laws of thought underlying categorization: mathematical principles governing how choices of definition can muddle communication or be deceptive. (But that wasn't surprising; by Scott's own admission, he's not a math guy.)

Category "boundaries" are a useful visual metaphor for explaining the cognitive function of categorization: you imagine a "boundary" in configuration space containing all the things that belong to the category.

If you have the visual metaphor, but you don't have the math, you might think that there's nothing intrinsically wrong with squiggly or discontinuous category "boundaries", just as there's nothing intrinsically wrong with Alaska not being part of the contiguous United States. It may be inconvenient that you can't drive from Alaska to Washington without going through Canada, but it's not wrong that the borders are drawn that way: Alaska really is governed by the United States.

But if you do have the math, a moment of introspection will convince you that the analogy between category "boundaries" and national borders is shallow.

A two-dimensional political map tells you which areas of the Earth's surface are under the jurisdiction of which government. In contrast, category "boundaries" tell you which regions of very high-dimensional configuration space correspond to a word/concept, which is useful because that structure can be used to make probabilistic inferences. You can use your observations of some aspects of an entity (some of the coordinates of a point in configuration space) to infer category-membership, and then use category membership to make predictions about aspects that you haven't yet observed.

But the trick only works to the extent that the category is a regular, non-squiggly region of configuration space: if you know that egg-shaped objects tend to be blue, and you see a black-and-white photo of an egg-shaped object, you can get close to picking out its color on a color wheel. But if egg-shaped objects tend to be blue or green or red or gray, you wouldn't know where to point to on the color wheel.

The analogous algorithm applied to national borders on a political map would be to observe the longitude of a place, use that to guess what country the place is in, and then use the country to guess the latitude—which isn't typically what people do with maps. Category "boundaries" and national borders might both be illustrated similarly in a two-dimensional diagram, but philosophically, they're different entities. The fact that Scott Alexander was appealing to national borders to defend gerrymandered categories suggested that he didn't understand this.

I still had some deeper philosophical problems to resolve, though. If squiggly categories were less useful for inference, why would someone want a squiggly category boundary? Someone who said, "Ah, but I assign higher utility to doing it this way" had to be messing with you. Squiggly boundaries were less useful for inference; the only reason you would realistically want to use them would be to commit fraud, to pass off pyrite as gold by redefining the word "gold".

That was my intuition. To formalize it, I wanted some sensible numerical quantity that would be maximized by using "nice" categories and get trashed by gerrymandering. Mutual information was the obvious first guess, but that wasn't it, because mutual information lacks a "topology", a notion of "closeness" that would make some false predictions better than others by virtue of being "close".

Suppose the outcome space of X is {H, T} and the outcome space of Y is {1, 2, 3, 4, 5, 6, 7, 8}. I wanted to say that if observing X=H concentrates Y's probability mass on {1, 2, 3}, that's more useful than if it concentrates Y on {1, 5, 8}. But that would require the numerals in Y to be numbers rather than opaque labels; as far as elementary information theory was concerned, mapping eight states to three states reduced the entropy from log₂ 8 = 3 bits to log₂ 3 ≈ 1.58 bits no matter which three states they were.
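To make the point concrete, here is a minimal Python sketch. The function name `entropy_bits` and the choice of a uniform distribution over the three surviving states are assumptions made for illustration, not something from the original draft. Shannon entropy looks only at the probabilities, so it cannot tell {1, 2, 3} apart from {1, 5, 8}.

```python
import math

def entropy_bits(dist):
    """Shannon entropy, in bits, of a distribution given as {outcome: probability}."""
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical uniform distributions over Y (an assumption for illustration).
prior = {y: 1 / 8 for y in range(1, 9)}      # before observing X
clustered = {y: 1 / 3 for y in (1, 2, 3)}    # X=H concentrates Y on neighbors
scattered = {y: 1 / 3 for y in (1, 5, 8)}    # X=H concentrates Y on spread-out values

print(entropy_bits(prior))      # three bits
print(entropy_bits(clustered))  # roughly 1.58 bits
print(entropy_bits(scattered))  # identical to the clustered case
```

The outcome labels never enter the calculation, which is why entropy (and likewise mutual information) has no notion of some wrong guesses being "closer" than others.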

How could I make this rigorous? Did I want to be talking about the variance of my features conditional on category membership? Was "connectedness" what I wanted, or was it only important because it cut down the number of possibilities? (There are 8!/(6!2!) = 28 ways to choose two elements from {1..8}, but only 7 ways to choose two contiguous elements.) I thought connectedness was intrinsically important, because we didn't just want few things, we wanted things that are similar enough to make similar decisions about.
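The counting claim in the parenthetical can be checked by brute force. This is only an illustrative sketch; the variable names are hypothetical.

```python
from itertools import combinations

states = range(1, 9)  # the outcome space {1..8}

# Every unordered pair of distinct states.
pairs = list(combinations(states, 2))

# Pairs whose two elements are adjacent integers.
contiguous = [(a, b) for a, b in pairs if b - a == 1]

print(len(pairs), len(contiguous))
```

Contiguity cuts the number of candidate pairs by a factor of four, which is the "fewer possibilities" reading of connectedness, as distinct from the "similar enough to decide about together" reading.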

I put the question to a few friends in July 2020 (Subject: "rubber duck philosophy"), and Jessica said that my identification of the variance as the key quantity sounded right: it amounted to the expected squared error of someone trying to guess the values of the features given the category. It was okay that this wasn't a purely information-theoretic criterion, because for problems involving guessing a numeric quantity, bits that get you closer to the right answer were more valuable than bits that didn't.
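As a rough sketch of that criterion, applied to the {1, 2, 3} versus {1, 5, 8} example, the following assumes equally likely outcomes and a guesser who always guesses the mean (the guess that minimizes expected squared error). The function name `expected_squared_error` is hypothetical.

```python
def expected_squared_error(outcomes):
    """Expected squared error of guessing the mean of equally likely outcomes.

    For a guesser who always guesses the mean, this is the variance.
    """
    mean = sum(outcomes) / len(outcomes)
    return sum((y - mean) ** 2 for y in outcomes) / len(outcomes)

clustered_error = expected_squared_error([1, 2, 3])  # 2/3
scattered_error = expected_squared_error([1, 5, 8])  # 74/9

print(clustered_error, scattered_error)
```

Unlike entropy, this quantity treats the outcomes as numbers rather than opaque labels, so the clustered set scores much better than the scattered one even though both contain three states.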

A Couple of Impulsive Emails (September 2020)

I decided on "Unnatural Categories Are Optimized for Deception" as the title for my advanced categorization thesis. Writing it up was a major undertaking. There were a lot of nuances to address and potential objections to preëmpt, and I felt that I had to cover everything. (A reasonable person who wanted to understand the main ideas wouldn't need so much detail, but I wasn't up against reasonable people who wanted to understand.)

In September 2020, Yudkowsky Tweeted something about social media incentives prompting people to make nonsense arguments, and something in me boiled over. The Tweets were fine in isolation, but I rankled at them given the absurdly disproportionate efforts I was undertaking to unwind his incentive-driven nonsense. I left a snarky, pleading reply and vented on my own timeline (with preview images from the draft of "Unnatural Categories Are Optimized for Deception"):

Who would have thought getting @ESYudkowsky's robot cult to stop trying to trick me into cutting my dick off (independently of the empirical facts determining whether or not I should cut my dick off) would involve so much math?? OK, I guess the math part isn't surprising, but—31

My rage-boil continued into staying up late writing him an angry email, which I mostly reproduce below (with a few redactions for either brevity or compliance with privacy norms, but I'm not going to clarify which).

To: Eliezer Yudkowsky <[redacted]>
Cc: Anna Salamon <[redacted]>
Date: Sunday 13 September 2020 2:24 a.m.
Subject: out of patience

"I could beg you to do it in order to save me. I could beg you to do it in order to avert a national disaster. But I won't. These may not be valid reasons. There is only one reason: you must say it, because it is true."
Atlas Shrugged by Ayn Rand

Dear Eliezer (cc Anna as mediator):

Sorry, I'm getting really really impatient (maybe you saw my impulsive Tweet-replies today; and I impulsively called Anna today; and I've spent the last few hours drafting an even more impulsive hysterical-and-shouty potential Less Wrong post; but now I'm impulsively deciding to email you in the hopes that I can withhold the hysterical-and-shouty post in favor of a lower-drama option of your choice): is there any way we can resolve the categories dispute in public?! Not any object-level gender stuff which you don't and shouldn't care about, just the philosophy-of-language part.

My grievance against you is very simple. You are on the public record claiming that:

you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning.

I claim that this is false. I think I am standing in defense of truth when I insist on a word, brought explicitly into question, being used with some particular meaning, when I have an argument for why my preferred usage does a better job of "carving reality at the joints" and the one bringing my usage into question doesn't have such an argument. And in particular, "This word usage makes me sad" doesn't count as a relevant argument. I agree that words don't have intrinsic ontologically-basic meanings, but precisely because words don't have intrinsic ontologically-basic meanings, there's no reason to challenge someone's word usage except because of the hidden probabilistic inference it embodies.

Imagine one day David Gerard of /r/SneerClub said, "Eliezer Yudkowsky is a white supremacist!" And you replied: "No, I'm not! That's a lie." And imagine E.T. Jaynes was still alive and piped up, "You are ontologically confused if you think that's a false assertion. You're not standing in defense of truth if you insist on words, such as white supremacist, brought explicitly into question, being used with some particular meaning." Suppose you emailed Jaynes about it, and he brushed you off with, "But I didn't say you were a white supremacist; I was only targeting a narrow ontology error." In this hypothetical situation, I think you might be pretty upset—perhaps upset enough to form a twenty-one month grudge against someone whom you used to idolize?

I agree that pronouns don't have the same function as ordinary nouns. However, in the English language as actually spoken by native speakers, I think that gender pronouns do have effective "truth conditions" as a matter of cognitive science. If someone said, "Come meet me and my friend at the mall; she's really cool and you'll like her", and then that friend turned out to look like me, you would be surprised.

I don't see the substantive difference between "You're not standing in defense of truth (...)" and "I can define a word any way I want." [...]

[...]

As far as your public output is concerned, it looks like you either changed your mind about how the philosophy of language works, or you think gender is somehow an exception. If you didn't change your mind, and you don't think gender is somehow an exception, is there some way we can get that on the public record somewhere?!

As an example of such a "somewhere", I had asked you for a comment on my explanation, "Where to Draw the Boundaries?" (with non-politically-hazardous examples about dolphins and job titles) [...] I asked for a comment from Anna, and at first she said that she would need to "red team" it first (because of the political context), and later she said that she was having difficulty for other reasons. Okay, the clarification doesn't have to be on my post. I don't care about credit! I don't care whether or not anyone is sorry! I just need this trivial thing settled in public so that I can stop being in pain and move on with my life.

As I mentioned in my Tweets today, I have a longer and better explanation than "... Boundaries?" mostly drafted. (It's actually somewhat interesting; the logarithmic score doesn't work as a measure of category-system goodness because it can only reward you for the probability you assign to the exact answer, but we want "partial credit" for almost-right answers, so the expected squared error is actually better here, contrary to what you said in the "Technical Explanation" about what Bayesian statisticians do). [...]
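A minimal sketch of the "partial credit" point, using made-up numbers that are not from the original exchange: two predictive distributions assign the same probability to the exact answer, so the logarithmic score cannot tell them apart, while the expected squared error favors the one whose misses land close to the truth. The names `near_miss` and `far_miss` are hypothetical.

```python
import math

def log_score(dist, truth):
    """Logarithmic score: log of the probability assigned to the exact answer."""
    return math.log(dist[truth])

def expected_squared_error(dist, truth):
    """Probability-weighted squared distance between each guess and the truth."""
    return sum(p * (x - truth) ** 2 for x, p in dist.items())

truth = 5

# Both distributions put probability 0.1 on the exact answer (illustrative values).
near_miss = {5: 0.1, 4: 0.45, 6: 0.45}   # remaining mass is close to the truth
far_miss = {5: 0.1, 0: 0.45, 10: 0.45}   # remaining mass is far from the truth

# The log score only looks at dist[truth], so it is identical for both.
same_log_score = log_score(near_miss, truth) == log_score(far_miss, truth)

# The expected squared error gives partial credit for almost-right answers.
near_error = expected_squared_error(near_miss, truth)
far_error = expected_squared_error(far_miss, truth)

print(same_log_score, near_error, far_error)
```

Under these assumptions the two distributions tie on log score, but the near-miss distribution has a much smaller expected squared error.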

The only thing I've been trying to do for the past twenty-one months is make this simple thing established "rationalist" knowledge:

(1) For all nouns N, you can't define N any way you want, for at least 37 reasons.

(2) Woman is such a noun.

(3) Therefore, you can't define the word woman any way you want.

(Note, this is totally compatible with the claim that trans women are women, and trans men are men, and nonbinary people are nonbinary! It's just that you have to argue for why those categorizations make sense in the context you're using the word, rather than merely asserting it with an appeal to arbitrariness.)

This is literally modus ponens. I don't understand how you expect people to trust you to save the world with a research community that literally cannot perform modus ponens.

[...] See, I thought you were playing on the chessboard of being correct about rationality. Such that, if you accidentally mislead people about your own philosophy of language, you could just ... issue a clarification? I and Michael and Ben and Sarah and ["Riley"] and Jessica wrote to you about this and explained the problem in painstaking detail, and you stonewalled us. Why? Why is this so hard?!

[...]

No. The thing that's been driving me nuts for twenty-one months is that I expected Eliezer Yudkowsky to tell the truth. I remain,

Your heartbroken student,
Zack M. Davis

I followed it with another email after I woke up the next morning:

To: Eliezer Yudkowsky <[redacted]>
Cc: Anna Salamon <[redacted]>
Date: Sunday 13 September 2020 11:02 a.m.
Subject: Re: out of patience

[...] The sinful and corrupted part wasn't the initial Tweets; the sinful and corrupted part is this bullshit stonewalling when your Twitter followers and me and Michael and Ben and Sarah and ["Riley"] and Jessica tried to point out the problem. I've never been arguing against your private universe [...]; the thing I'm arguing against in "Where to Draw the Boundaries?" (and my unfinished draft sequel, although that's more focused on what Scott wrote) is the actual text you actually published, not your private universe.

[...] you could just publicly clarify your position on the philosophy of language the way an intellectually-honest person would do if they wanted their followers to have correct beliefs about the philosophy of language?!

You wrote:

Using language in a way you dislike, openly and explicitly and with public focus on the language and its meaning, is not lying.

Now, maybe as a matter of policy, you want to make a case for language being used a certain way. Well, that's a separate debate then. But you're not making a stand for Truth in doing so, and your opponents aren't tricking anyone or trying to.

The problem with "it's a policy debate about how to use language" is that it completely elides the issue that some ways of using language perform better at communicating information, such that attempts to define new words or new senses of existing words should come with a justification for why the new sense is useful for conveying information, and that is a matter of Truth. Without such a justification, it's hard to see why you would want to redefine a word except to mislead people with strategic equivocation.

It is literally true that Eliezer Yudkowsky is a white supremacist (if I'm allowed to define "white supremacist" to include "someone who once linked to the 'Race and intelligence' Wikipedia page in a context that implied that it's an empirical question").

It is literally true that 2 + 2 = 6 (if I'm allowed to define '2' as •••-many).

You wrote:

The more technology advances, the further we can move people towards where they say they want to be in sexspace. Having said this we've said all the facts.

That's kind of like defining Solomonoff induction, and then saying, "Having said this, we've built AGI." No, you haven't said all the facts! Configuration space is very high-dimensional; we don't have access to the individual points. Trying to specify the individual points ("say all the facts") would be like what you wrote about in "Empty Labels"—"not just that I can vary the label, but that I can get along just fine without any label at all." Since that's not possible, we need to group points in the space together so that we can use observations from the coordinates that we have observed to make probabilistic inferences about the coordinates we haven't. But there are mathematical laws governing how well different groupings perform, and those laws are a matter of Truth, not a mere policy debate.

[...]

But if behavior at equilibrium isn't deceptive, there's just no such thing as deception; I wrote about this on Less Wrong in "Maybe Lying Can't Exist?!" (drawing on the academic literature about sender–receiver games). I don't think you actually want to bite that bullet?

In terms of information transfer, there is an isomorphism between saying "I reserve the right to lie 5% of the time about whether something is a member of category C" and adopting a new definition of C that misclassifies 5% of instances with respect to the old definition.
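One way to check this claim is to compare the mutual information between the old-definition truth and the reported label under each scheme. The sketch below is illustrative only. It assumes a 50/50 base rate for C and assumes the new definition's errors are spread evenly across both classes. Neither assumption comes from the original email.

```python
import math
from collections import Counter, defaultdict

def mutual_information(joint):
    """Mutual information in bits of a joint distribution {(x, y): p}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Scheme A: keep the old definition of C, but lie about membership 5% of the time.
prior = {True: 0.5, False: 0.5}  # assumed base rate of membership in C
lie_rate = 0.05
joint_lying = {(t, r): prior[t] * (1 - lie_rate if r == t else lie_rate)
               for t in (True, False) for r in (True, False)}

# Scheme B: always report honestly, but under a new definition that disagrees
# with the old one on 5% of instances (assumed: 5 of every 100 in each class).
population = [(old, (not old) if i < 5 else old)
              for old in (True, False) for i in range(100)]
counts = Counter(population)
joint_redefined = {pair: n / len(population) for pair, n in counts.items()}

mi_lying = mutual_information(joint_lying)
mi_redefined = mutual_information(joint_redefined)
print(mi_lying, mi_redefined)
```

Under these assumptions the two joint distributions coincide, so a listener who only knows the old definition learns exactly as much (or as little) from either speaker.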

Like, I get that you're ostensibly supposed to be saving the world and you don't want randos yelling at you in your email about philosophy. But I thought the idea was that we were going to save the world by means of doing unusually clear thinking?

Scott wrote (with an irrelevant object-level example redacted): "I ought to accept an unexpected [X] or two deep inside the conceptual boundaries of what would normally be considered [Y] if it'll save someone's life." (Okay, he added a clarification after I spent Christmas yelling at him; but I think he's still substantially confused in ways that I address in my forthcoming draft post.)

You wrote: "you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning."

I think I've argued pretty extensively this is wrong! I'm eager to hear counterarguments if you think I'm getting the philosophy wrong. But ... "people live in different private universes" is not a counterargument.

It makes sense that you don't want to get involved in gender politics. That's why I wrote "... Boundaries?" using examples about dolphins and job titles, and why my forthcoming post has examples about bleggs and artificial meat. This shouldn't be expensive to clear up?! This should take like, five minutes? (I've spent twenty-one months of my life on this.) Just one little ex cathedra comment on Less Wrong or somewhere (it doesn't have to be my post, if it's too long or I don't deserve credit or whatever; I just think the right answer needs to be public) affirming that you haven't changed your mind about 37 Ways Words Can Be Wrong? Unless you have changed your mind, of course?

I can imagine someone observing this conversation objecting, "[...] why are you being so greedy? We all know the real reason you want to clear up this philosophy thing in public is because it impinges on your gender agenda, but Eliezer already threw you a bone with the 'there's probably more than one type of dysphoria' thing. That was already a huge political concession to you! That makes you more than even; you should stop being greedy and leave Eliezer alone."

But as I explained in my reply criticizing why I think that argument is wrong, the whole mindset of public-arguments-as-political-favors is crazy. The fact that we're having this backroom email conversation at all (instead of just being correct about the philosophy of language on Twitter) is corrupt! I don't want to strike a deal in a political negotiation; I want shared maps that reflect the territory. I thought that's what this "rationalist community" thing was supposed to do? Is that not a thing anymore? If we can't do the shared-maps thing when there's any hint of political context (such that now you can't clarify the categories thing, even as an abstract philosophy issue about bleggs, because someone would construe that as taking a side on whether trans people are Good or Bad), that seems really bad for our collective sanity?! (Where collective sanity is potentially useful for saving the world, but is at least a quality-of-life improver if we're just doomed to die in 15 years no matter what.)

I really used to look up to you. In my previous interactions with you, I've been tightly cognitively constrained by hero-worship. I was already so starstruck that Eliezer Yudkowsky knows who I am, that the possibility that Eliezer Yudkowsky might disapprove of me, was too terrifying to bear. I really need to get over that, because it's bad for me, and it's really bad for you. I remain,

Your heartbroken student,
Zack M. Davis

These emails were pretty reckless by my usual standards. (If I was entertaining some hope of serving as a mediator between the Caliphate and Vassar's splinter group after the COVID lockdowns were over, this outburst wasn't speaking well to my sobriety.) But as the subject line indicates, I was just—out of patience. I had spent years making all the careful arguments I could make. What was there left for me to do but scream?

The result of this recklessness was ... success! Without disclosing anything from any private conversations that may or may not have occurred, Yudkowsky did publish a clarification on Facebook, that he had meant to criticize only the naïve essentialism of asserting that a word Just Means something and that anyone questioning it is Just Lying, and not the more sophisticated class of arguments that I had been making.

In particular, the post contained this line:

you are being the bad guy if you try to shut down that conversation by saying that "I can define the word 'woman' any way I want"

There it is! A clear ex cathedra statement that gender categories are not an exception to the general rule that categories aren't arbitrary. (Only 1 year and 8 months after asking for it.) I could quibble with some of Yudkowsky's exact writing choices, which I thought still bore the signature of political squirming,32 but it would be petty to dwell on quibbles when the core problem had been addressed.

I wrote to Michael, Ben, Jessica, Sarah, and "Riley", thanking them for their support. After successfully bullying Scott and Eliezer into clarifying, I was no longer at war with the robot cult and feeling a lot better (Subject: "thank-you note (the end of the Category War)").

I had a feeling, I added, that Ben might be disappointed with the thank-you note insofar as it could be read as me having been "bought off" rather than being fully on the side of clarity-creation. But I contended that not being at war actually made it emotionally easier to do clarity-creation writing. Now I would be able to do it in a contemplative spirit of "Here's what I think the thing is actually doing" rather than in hatred with flames on the side of my face.

A Private Catastrophe (December 2020)

There's a dramatic episode that would fit here chronologically if this were an autobiography (which existed to tell my life story), but since this is a topic-focused memoir (which exists because my life happens to contain this Whole Dumb Story which bears on matters of broader interest, even if my life would not otherwise be interesting), I don't want to spend more wordcount than is needed to briefly describe the essentials.

I was charged by members of the extended Michael Vassar–adjacent social circle with the duty of taking care of a mentally-ill person at my house on 18 December 2020. (We did not trust the ordinary psychiatric system to act in patients' interests.) I apparently did a poor job, and ended up saying something callous on the care team group chat after a stressful night, which led to a chaotic day on the nineteenth, and an ugly falling-out between me and the group. The details aren't particularly of public interest.

My poor performance during this incident weighs on my conscience particularly because I had previously been in the position of being crazy and benefiting from the help of my friends (including many of the same people involved in this incident) rather than getting sent back to psychiatric prison ("hospital", they call it a "hospital"). Of all people, I had a special debt to "pay it forward", and one might have hoped that I would also have special skills, that having been on the receiving end of a non-institutional psychiatric tripsitting operation would help me know what to do on the giving end. Neither of those panned out.

Some might appeal to the proverb "All's well that ends well", noting that the person in trouble ended up recovering, and that, while the stress of the incident contributed to a somewhat serious relapse of my own psychological problems on the night of the nineteenth and in the following weeks, I ended up recovering, too. But recovering normal functionality after a traumatic episode doesn't imply a lack of other lasting consequences (to the psyche, to trusting relationships, &c.). I am therefore inclined to dwell on another proverb, "A lesson is learned but the damage is irreversible."

A False Dénouement (January 2021)

I published "Unnatural Categories Are Optimized for Deception" in January 2021.

I wrote back to Abram Demski regarding his comments from fourteen months before: on further thought, he was right. Even granting my point that evolution didn't figure out how to track probability and utility separately, as Abram had pointed out, the fact that it didn't meant that not tracking it could be an effective AI design. Just because evolution takes shortcuts that human engineers wouldn't didn't mean shortcuts are "wrong". (Rather, there are laws governing which kinds of shortcuts work.)

Abram was also right that it would be weird if reflective coherence was somehow impossible: the AI shouldn't have to fundamentally reason differently about "rewriting code in some 'external' program" and "rewriting 'its own' code." In that light, it made sense to regard "have accurate beliefs" as merely a convergent instrumental subgoal, rather than what rationality is about—as sacrilegious as that felt to type.

And yet, somehow, "have accurate beliefs" seemed more fundamental than other convergent instrumental subgoals like "seek power and resources". Could this be made precise? As a stab in the dark, was it possible that the theorems on the ubiquity of power-seeking might generalize to a similar conclusion about "accuracy-seeking"? If it didn't, the reason why it didn't might explain why accuracy seemed more fundamental.


And really, that should have been the end of the story. At the cost of two years of my life, we finally got a clarification from Yudkowsky that you can't define the word woman any way you like. This suggested poor cognitive returns on investment from interacting with the "rationalist" community—if it took that much effort to correct a problem I had noticed myself, I couldn't expect them to help me with problems I couldn't detect—but I didn't think I was entitled to more. If I hadn't been further provoked, I wouldn't have occasion to continue waging the robot-cult religious civil war.

It turned out that I would have occasion to continue waging the robot-cult religious civil war. (To be continued.)


  1. The original quote says "one hundred thousand straights" ... "gay community" ... "gay and lesbian" ... "franchise rights on homosexuality" ... "unauthorized queer." 

  2. Although Sarah Constantin and "Riley" had also been involved in reaching out to Yudkowsky and were included in many subsequent discussions, they seemed like more marginal members of the group that was forming. 

  3. At least, not blameworthy in the same way as someone who committed the same violence as an individual. 

  4. The Sequences post referenced here, "Your Price for Joining", argues that rationalists are too prone to "take their ball and go home" rather than tolerating imperfections in a collective endeavor. To combat this, Yudkowsky proposes a norm:

    If the issue isn't worth your personally fixing by however much effort it takes, and it doesn't arise from outright bad faith, it's not worth refusing to contribute your efforts to a cause you deem worthwhile.

    I claim that I was meeting this standard: I was willing to personally fix the philosophy-of-categorization issue no matter how much effort it took, and the issue did arise from outright bad faith. 

  5. It was common practice in our subculture to name group houses. My apartment was "We'll Name It Later." 

  6. I'm not giving Mike a pseudonym because his name is needed for this adorable anecdote to make sense, and I'm not otherwise saying sensitive things about him. 

  7. Anna was a very busy person who I assumed didn't always have time for me, and I wasn't earning-to-give anymore after my 2017 psych ward experience made me more skeptical about institutions (including EA charities) doing what they claimed. Now that I'm not currently dayjobbing, I wish I had been somewhat less casual about spending money during this period. 

  8. I was still deep enough in my hero worship that I wrote "plausibly" in an email at the time. Today, I would not consider the adverb necessary. 

  9. I particularly appreciated Said Achmiz's defense of disregarding community members' feelings, and Ben's commentary on speech acts that lower the message length of proposals to attack some group. 

  10. No one ever seems to be able to explain to me what this phrase means. 

  11. For one important disanalogy, perps don't gain from committing manslaughter. 

  12. The draft was hidden, but the API apparently didn't filter out comments on hidden posts, and the thread was visible on the third-party GreaterWrong site; I filed a bug. 

  13. Arnold qualifies this in the next paragraph:

    [in public. In private things are much easier. It's also the case that private channels enable collusion—that was an update [I]'ve made over the course of the conversation. ]

    Even with the qualifier, I still think this deserves a "(!!)". 

  14. An advantage of mostly living on the internet is that I have logs of the important things. I'm only able to tell this Whole Dumb Story with this much fidelity because for most of it, I can go back and read the emails and chatlogs from the time. Now that audio transcription has fallen to AI, maybe I should be recording more real-life conversations? In the case of this meeting, supposedly one of the Less Wrong guys was recording, but no one had it when I asked in October 2022. 

  15. Rationality and Effective Altruism Community Hub 

  16. Oddly, Kelsey seemed to think the issue was that my allies and I were pressuring Yudkowsky to make a public statement, which he supposedly never does. From our perspective, the issue was that he had made a statement and it was wrong. 

  17. As I had explained to him earlier, Alexander's famous post on the noncentral fallacy condemned the same shenanigans he praised in the context of gender identity: Alexander's examples of the noncentral fallacy had been about edge-cases of a negative-valence category being inappropriately framed as typical (abortion is murder, taxation is theft), but "trans women are women" was the same thing, but with a positive-valence category.

    In "Does the Glasgow Coma Scale exist? Do Comas?" (published just three months before "... Not Man for the Categories"), Alexander defends the usefulness of "comas" and "intelligence" in terms of their predictive usefulness. (The post uses the terms "predict", "prediction", "predictive power", &c. 16 times.) He doesn't say that the Glasgow Coma Scale is justified because it makes people happy for comas to be defined that way, because that would be absurd. 

  18. The last of the original Sequences had included a post, "Rationality: Common Interest of Many Causes", which argued that different projects should not regard themselves "as competing for a limited supply of rationalists with a limited capacity for support; but, rather, creating more rationalists and increasing their capacity for support." It was striking that the "Kolmogorov Option"-era Caliphate took the opposite policy: throwing politically unpopular projects (like autogynephilia- or human-biodiversity-realism) under the bus to protect its own status. 

  19. The original TechCrunch comment would seem to have succumbed to linkrot, but it was quoted by Moldbug and others. 

  20. The pleonasm here ("to me" being redundant with "I thought") is especially galling coming from someone who's usually a good writer! 

  21. At best, "I" statements make sense in a context where everyone's speech is considered part of the "official record". Wrapping controversial claims in "I think" removes the need for opponents to immediately object for fear that the claim will be accepted onto the shared map. 

  22. Specifically, altruism towards the author. Altruistic benefits to other readers are a reason for criticism to be public. 

  23. That is, there's an analogy between economically valuable labor, and intellectually productive criticism: if you accept the necessity of paying workers money in order to get good labor out of them, you should understand the necessity of awarding commenters status in order to get good criticism out of them. 

  24. On the other hand, there's a case to be made that the connection between white-collar crime and the problems we saw with the community is stronger than it first appears. Trying to describe the Blight to me in April 2019, Ben wrote, "People are systematically conflating corruption, accumulation of dominance, and theft, with getting things done." I imagine a rank-and-file EA looking at this text and shaking their head at how hyperbolically uncharitable Ben was being. Dominance, corruption, theft? Where was his evidence for these sweeping attacks on these smart, hard-working people trying to make the world a better place?

    In what may be a relevant case study, three and a half years later, the FTX cryptocurrency exchange founded by effective altruists as an earning-to-give scheme turned out to be an enormous fraud à la Enron and Madoff. In Going Infinite, Michael Lewis's book on FTX mastermind Sam Bankman-Fried, Lewis describes Bankman-Fried's "access to a pool of willing effective altruists" as the "secret weapon" of FTX predecessor Alameda Research: Wall Street firms powered by ordinary greed would have trouble trusting employees with easily-stolen cryptocurrency, but ideologically-driven EAs could be counted on to be working for the cause. Lewis describes Alameda employees seeking to prevent Bankman-Fried from deploying a trading bot with access to $170 million for fear of losing all that money "that might otherwise go to effective altruism". Zvi Mowshowitz's review of Going Infinite recounts Bankman-Fried in 2017 urging Mowshowitz to disassociate with Ben because Ben's criticisms of EA hurt the cause. (It's a small world.)

    Rank-and-file EAs can contend that Bankman-Fried's crimes have no bearing on the rest of the movement, but insofar as FTX looked like a huge EA success before it turned out to all be a lie, Ben's 2019 complaints are looking prescient to me in retrospect. (And insofar as charitable projects are harder to evaluate than whether customers can withdraw their cryptocurrency, there's reason to fear that other apparent EA successes may also be illusory.) 

  25. The ungainly title was softened from an earlier draft following feedback from the posse; I had originally written "... Surprisingly Useless". 

  26. On this point, it may be instructive to note that a 2023 survey found that only 60% of the UK public knew that "trans women" were born male. 

  27. Enough to not even scare-quote the term here. 

  28. I had identified three classes of reasons not to carve reality at the joints: coordination (wanting everyone to use the same definitions), wireheading (making the map look good, at the expense of it failing to reflect the territory), and war (sabotaging someone else's map to make them do what you want). Michael's proposal would fall under "coordination" insofar as it was motivated by the need to use the same categories as everyone else. (Although you could also make a case for "war" insofar as the civil-rights model winning entailed that adherents of the TERF or medical models must lose.) 

  29. Reasonable trans people aren't the ones driving the central tendency of the trans rights movement. When analyzing a wave of medical malpractice on children, I think I'm being literal in attributing causal significance to a political motivation to affirm the narcissistic delusions of (some) guys like me, even though not all guys like me are delusional, and many guys like me are doing fine maintaining a non-guy social identity without spuriously dragging children into it. 

  30. Oskar Pfungst, Clever Hans (The Horse Of Mr. Von Osten): A Contribution To Experimental Animal and Human Psychology, translated from the German by Carl L. Rahn 

  31. I anticipate that some readers might object to the "trying to trick me into cutting my dick off" characterization. But as Ben had pointed out earlier, we have strong reason to believe that an information environment of ubiquitous propaganda was creating medical transitions on the margin. I think it made sense for me to use emphatic language to highlight what was actually at stake here! 

  32. The way that the post takes pains to cast doubt on whether someone who is alleged to have committed the categories-are-arbitrary fallacy is likely to have actually committed it ("the mistake seems like it wouldn't actually fool anybody or be committed in real life, I am unlikely to be sympathetic to the argument", "But be wary of accusing somebody of planning to do this, if you haven't documented them actually doing it") is in stark contrast to the way that "A Human's Guide to Words" had taken pains to emphasize that categories shape cognition regardless of whether someone is consciously trying to trick you ("drawing a boundary in thingspace is not a neutral act [...] Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind"). I'm suspicious that the change in emphasis reflects the need to not be seen as criticizing the "pro-trans" coalition, rather than any new insight into the subject matter.

    The first comment on the post linked to "... Not Man for the Categories". Yudkowsky replied, "I assumed everybody reading this had already read https://wiki.lesswrong.com/wiki/A_Human's_Guide_to_Words", a non sequitur that could be taken to suggest (but did not explicitly say) that the moral of "... Not Man for the Categories" was implied by "A Human's Guide to Words" (in contrast to my contention that "... Not Man for the Categories" was getting it wrong). 


Reply to Scott Alexander on Autogenderphilia

Why idly theorize when you can JUST CHECK and find out the ACTUAL ANSWER to a superficially similar-sounding question SCIENTIFICALLY?

Steven Kaas

In "Autogenderphilia Is Common And Not Especially Related To Transgender", Scott Alexander, based on his eyeballing of data from the 2020 Slate Star Codex reader survey, proposes what he calls a "very boring" hypothesis of "autogenderphilia": "if you identify as a gender, and you're attracted to that gender, it's a natural leap to be attracted to yourself being that gender."

Explaining my view on this "boring hypothesis" turns out to be a surprisingly challenging writing task, because I suspect my actual crux comes down to a Science vs. Bayescraft thing, where I'm self-conscious about my answer sounding weirdly overconfident on non-empirical grounds to someone who doesn't already share my parsimony intuitions—but, well, bluntly, I also expect my parsimony intuitions to get the right answer in the high-dimensional real world outside of a single forced-choice survey question.

Let me explain.

In my ontology of how the world works, "if you identify as a gender, and you're attracted to that gender, it's a natural leap to be attracted to yourself being that gender" is not a boring hypothesis. In my ontology, this is a shockingly weird hypothesis, where I can read the English words, but I have a lot of trouble parsing the English words into a model in my head, because the antecedent, "If you identify as a gender, and you're attracted to that gender, then ...", already takes a massive prior probability penalty, because that category is multiply disjunctive over the natural space of biological similarities: you're grouping together lesbians and gay men and heterosexual males with a female gender identity and heterosexual females with a male gender identity, and trying to make a claim about what members of this group are like.

What do lesbians, and gay men, and heterosexual males with a female gender identity, and heterosexual females with a male gender identity have in common, such that we expect to make useful inductive inferences about this group?

Well, they're all human; that buys you a lot of similarities!

But your hypothesis isn't about humans-in-general, it's specifically about people who "identify as a gender, and [are] attracted to that gender".

So the question becomes, what do lesbians, and gay men, and heterosexual males with a female gender identity, and heterosexual females with a male gender identity have in common, that they don't have in common with heterosexual males and females without a cross-sex identity?

Well, sociologically, they're demographically eligible for our Society's LGBTQIA+ political coalition, living outside of what traditional straight Society considers "normal." That shared social experience could induce similarities.

But your allegedly boring hypothesis is not appealing to a shared social experience; you're saying "it's a natural leap to be attracted ...", appealing to the underlying psychology of sexual attraction in a way that doesn't seem very culture-sensitive. In terms of the underlying psychology of sexual attraction, what do lesbians, and gay men, and heterosexual males with a female gender identity, and heterosexual females with a male gender identity have in common, that they don't have in common with heterosexual males and females without a cross-sex identity?

I think the answer here is just "Nothing."

Oftentimes I want to categorize people by sex, and formulate hypotheses of the form, "If you're female/male, then ...". This is a natural category that buys me predictions about lots of stuff.

Sometimes I want to categorize people by gynephilic/androphilic sexual orientation: this helps me make sense of how lesbians are masculine compared to other females, and gay men are feminine compared to other males. (That is, it looks like homosexuality—not the kind of trans people Scott and I know—is probably a brain intersex condition, and the extreme right tail of homosexuality accounts for the kind of trans people we mostly don't know.)

But even so, when thinking about sexual orientation, I'm usually making a within-sex comparison: contrasting how gay men are different from ordinary men, how lesbians are different from ordinary women. I don't usually have much need to reason about "people who are attracted to the sex that they are" as a group, because that group splits cleanly into gay men and lesbians, which have a different underlying causal structure. "LGBT" (...QIA+) makes sense as a political coalition (who have a shared interest in resisting the oppression of traditional sexual morality), not because the L and the G and the B and the T are the same kind of people who live the same kind of lives. (Indeed, I don't even think the "T" is one thing.)

And so, given that I already don't have much use for "if you are a sex, and you're attracted to that sex" as a category of analytical interest, because I think gay men and lesbians are different things that need to be studied separately, "if you identify as a gender, and you're attracted to that gender" (with respect to "gender", not sex) comes off even worse. What causal mechanism could that correspond to?

Imagine a Bayesian network with real-valued variables with a cause C at the root, whose influence propagates to many effects (E₂ ← E₁ ← C → E₃ → E₄ ...). If someone proposes a theory about what happens to the Eᵢ when C is between 2 and 3 or between 5 and 6 or above 12, that's very unparsimonious: why would such a discontinuous hodge-podge of values for the cause have consistent effects?

In my worldview, "gender" (as the thing trans women and cis women have in common) looks like a hodge-podge as far as biology is concerned. (It can be real socially to the extent that people believe it's real and act accordingly, which creates the relevant conditional independence structure in their social behavior, but the kinds of sexuality questions under consideration don't seem like they would be sociologically determined.)

Again, I'm self-conscious that to someone who doesn't already share my worldview, this might seem dogmatically non-empirical—here I'm explaining why I can't take Scott Alexander's theory seriously without even addressing the survey data that he thinks his theory can explain that mine can't. Is this not a scientific sin? What is this "but causal mechanisms" technobabble, in the face of empirical survey data, huh?

The thing is, I don't see my theory as making particularly strong advance predictions one way or the other on how cis women or gay men will respond to the "How sexually arousing would you find it to imagine being him/her?" questions asked on the survey.

The reason I'm sold that autogynephilia (in males) "is a thing" and causally potent to transgenderedness in the first place is not because trans women gave a mean Likert response of 3.4 on someone's survey, but as the output of my brain's inductive inference algorithms operating on a massive confluence of real-life experiences and observations in a naturalistic setting. (That's how people locate which survey questions are worth asking in the first place, out of the vastness of possible survey questions.)

If you're not acquainted in a naturalistic setting with the phenomenon your survey is purporting to measure, you're not going to be able to sensibly interpret your survey results. Alexander writes that his data "suggest[s] that identifying as a gender is a prerequisite to autogenderphilia to it." This is obvious nonsense. There are mountains of AGP erotica produced by and for men who identify as men.

The surprising thing is that if you look at what trans women say to each other when the general public isn't looking, you see the same stories over and over again (examples from /r/MtF: "I get horny when I do 'girl things'. Is this a fetish?", "Is the 'body swap' fetish inherently pre-trans?", "Could it be a sex fantasy?", &c., ad infinitum).

The AGP experiences described in such posts by males who identify as trans women seem strikingly similar to AGP experiences in males who identify as men. I think the very boring hypothesis here is that these are mostly the same people—that identifying as a trans woman is an effect (of AGP and other factors) rather than a cause.

After observing this kind of pattern in the world, it's a good idea to do surveys to get some data to learn more about what's going on with the pattern. Are these accounts of AGP coming from a visible minority of trans women, or is it actually a majority? When 82% of /r/MtF users say Yes to a "Did you have a gender/body swap/transformation "fetish" (or similar) before you realized you were trans?" survey, that makes me think it's a majority.

When you pose a superficially similar-sounding question to a different group, are you measuring the same real-world phenomenon in that other group? Maybe, but I think this is nonobvious.

And in contexts where it's not politically inconvenient for him, Scott Alexander agrees with me: he wrote about this methodological issue in "My IRB Nightmare", expressing skepticism about a screening test for bipolar disorder:

You ask patients a bunch of things like "Do you ever feel really happy, then really sad?". If they say 'yes' to enough of these questions, you start to worry.

Some psychiatrists love this test. I hate it. Patients will say "Yes, that absolutely describes me!" and someone will diagnose them with bipolar disorder. Then if you ask what they meant, they'd say something like "Once my local football team made it to the Super Bowl and I was really happy, but then they lost and I was really sad." I don't even want to tell you how many people get diagnosed bipolar because of stuff like this.

There was a study that supposedly proved this test worked. But parts of it confused me, and it was done on a totally different population that didn't generalize to hospital inpatients.

The reason it makes sense for Alexander to be skeptical of the screening test is because our beliefs about the existence and etiology of "bipolar disorder" don't completely stand or fall on this particular test. People already had many observations pointing to the idea of "bipolar disorder" as a taxon. As an experienced clinician, when people whose favorite team lost the Super Bowl happen to answer "Yes" to some of the same survey questions as people who you've seen in the frenzy of mania and depths of depression, you generate the hypothesis: "Gee, maybe different populations are interpreting the question differently." Not as a certainty—maybe further research will provide more solid evidence that "bipolar disorder" isn't what you thought—but there's nothing un-Bayesian about thinking that your brain's pattern-matching capabilities are on to something important that this particular survey instrument isn't catching. You're not scientifically obligated to immediately jump to "Bipolar Is Common and Not Especially Related to Mania or Depression."

The failure of surveys to generalize between populations shouldn't even be surprising when you consider the ambiguity and fuzziness of natural language: faced with a question, and prompted to give a forced-choice Yes/No or 1–5 response, people will assume the question was "meant for them" and try to map the words into some reference point in their experience. If the question wasn't "meant for them"—if people who have never had a manic episode are given a set of questions formulated for a population of bipolar people—or if actual women are given a set of questions formulated for a population of males with a sex fantasy about being female—I think you do get a lot of "Am I happy then sad sometimes? Sure, I guess so" out-of-distribution response behavior that doesn't capture what's really going on. In slogan form, you are not measuring what you think you are measuring.

If Alexander is wary that a survey about moods done on a totally different population might not generalize to hospital inpatients, I think he should be still more wary that a survey about sexuality might not generalize to people of different sexes. Even if you're skeptical of most evopsych accounts of psychological sex differences (for there were no trucks or makeup in the environment of evolutionary adaptedness), sexuality is the one domain where I think we have strong prior reasons to expect cross-sex empathic inference to fail.

This is why I expect the standard cope of "But cis women are autogynephilic too!!" to fall apart on further examination. I'm not denying the survey data itself (neither Alexander's nor Moser 2009's); I'm saying we have enough prior knowledge about what females and males are like to suspect that women who answer Yes to the same survey questions as AGP males are mostly in the position of saying that they got really happy and then really sad when their team lost the Super Bowl. The common and normal experience of being happy and proud with one's own sexed body just isn't the same thing as cross-sex embodiment fantasies, even if people who aren't familiar with the lurid details of the latter don't realize this.

The reason this isn't special pleading that makes my theory unfalsifiable is because my skepticism is specifically about these mass survey questions where we haven't done the extra work to try to figure out whether the question means the same thing to everyone; I'm happy to talk about qualitative predictions about what we see when we have a higher-bandwidth channel into someone's mind than a 1–5 survey response, like free-form testimony. The account quoted in Alexander's post from a woman claiming to experience AGP does more to convince me that AGP in women might be a real thing than Slate Star Codex survey data showing straight cis women giving a mean response of 2.4 to the "How sexually arousing would you find it to imagine being her?" question. (And even then, I would want to ask followup questions to hopefully distinguish true female AGP from situations like when a female acquaintance of mine initially seemed to empathize with the concept, but found it bizarre when I elaborated a little more.)

While the promise of psychological research is that it might teach us things about ourselves that we don't already know, I still mostly expect it to all add up to normality—to retrodict the things we've already observed without the research.

My disquiet with Alexander's "Autogenderphilia Is Common And Not Especially Related To Transgender" (and similarly Aella's "Everyone Has Autogynephilia") is that it visibly fails to add up to normality. In a world where it was actually true that "if you identify as a gender, and you're attracted to that gender [...]", I would expect the things trans lesbians say to each other in naturalistic contexts when the general public isn't looking to look like the things cis lesbians say to each other in naturalistic contexts, and that's just not what I see.

Consider this quip from Twitter:

The eternal trans lesbian question: So do I want to be her, or do I want to be with her?

The answer: Yes

I see this "want her or want to be her" sentiment from trans women and non-transitioned AGP men very frequently. (I speak from experience.) Do cis lesbians often feel this way about each other? I'm inclined to doubt it—and the author seems to agree with me by calling the phenomenon a "trans lesbian" question rather than just a "lesbian" question! I think the very boring hypothesis here is that this is because trans lesbians are AGP men, which are not the same thing as actual lesbians. And I think that authors who can't bring themselves to say as much in those words are going to end up confusing themselves about the statistical structure of real life, even if they can agree that trans lesbians and straight men have some things in common.


Hrunkner Unnerby and the Shallowness of Progress

Apropos of absolutely nothing—and would I lie to you about that?—ever since early 2020, I keep finding myself thinking about Hrunkner Unnerby, one of the characters in the "B" story of Vernor Vinge's A Deepness in the Sky.

The B story focuses on spider-like aliens native to a planet whose star "turns off" for 215 years out of every 250, during which time life on the planet goes into hibernation. The aliens' reproductive cycle is synced with the appearance and disappearance of their sun, resulting in discrete birth cohorts: almost all children in a generation are the same age, conceived just before the world goes into hibernation, and born when the sun re-emerges. Rare "oophase" (out of phase) children are regarded as unnatural and discriminated against.

Our protagonists are Sherkaner Underhill (mad scientist extraordinaire), Gen. Victory Smith (an oophase military prodigy, and Underhill's wife), and Sgt. Hrunkner Unnerby (an engineer, and Underhill and Smith's friend and comrade from the Great War). After the war, Underhill and Smith deliberately conceive children out of phase. Underhill is motivated by a zeal for progress: a champion of plans to use the recent discovery of atomic power to keep civilization running while the sun is off, he reasons that the anti-oophase taboo will be unnecessary in the new world they're building. Smith seems motivated to pass on her oophase legacy to her children and give them a more supportive environment than the one she faced.

Unnerby is horrified, but tries to keep his disapproval to himself, so as not to poison the relationship. Besides being old war friends, Underhill and Smith depend on Unnerby for peacetime engineering work, preparing for the coming nuclear age. Underhill and Smith name one of their younger children after Unnerby ("Little Hrunk").

Still, there are tensions. When Unnerby visits Underhill's home and meets the children, Underhill expresses a wish that Unnerby had visited earlier, prompting the latter to acknowledge the elephant in the room:

Unnerby started to make some weak excuse, stopped. He just couldn't pretend anymore. Besides, Sherkaner was so much easier to face than the General. "You know why I didn't come before, Sherk. In fact, I wouldn't be here now if General Smith hadn't given me explicit orders. I'd follow her through Hell, you know that. But she wants more. She wants acceptance of your perversions. I—You two have such beautiful children, Sherk. How could you do such a thing to them?"

Underhill is resolute, convinced that Society's prejudices can be overcome after they are shown to be irrational. (The fact that one of the children is intellectually challenged doesn't faze him; that could be a coincidence.) The children host a "Children's Hour of Science" radio program, without their oophase status being public knowledge at first. Underhill hopes the program will help normalize oophase people when the stars' ages eventually leak.

During a crisis in which the children have been kidnapped by agents of a foreign power, Smith blows up at Unnerby when he makes some tone-deaf remarks. "For years you've pretended to be a friend, but always sneering and hating us. Enough!" she cries out, striking him. She continues to hold a grudge against him for years.

Smith and Unnerby eventually meet again as the sun is growing cold. Unnerby feels the unease of people being awake this long into the Dark, and senses the same in Smith. "You feel the same as I do about it, don't you?" he asks her. She reluctantly concedes this, and notes that their Society is running up against a lot of instinct.

I appreciate the relative even-handedness with which Vinge presents this fictional world. A lot of authors in the current year would be determined to present the "progressive" position as self-evidently correct and its enemies as malevolent fools (even in an alien Society which has no Watsonian reason to map cleanly onto our own), but I find it easy to empathize with both Smith and Unnerby.

And apropos of absolutely nothing, I empathize with Unnerby's efforts to reconcile his duties as a friend with his understanding of what is right. The people he loves having been taken possession of by an idea, he knows to separate the idea from the people. The sergeant is ever-ready to serve, even as he chokes on the message that has no hope of getting through past an ideologue's exuberance: every improvement is necessarily a change, but not every change is an improvement.


Beyond the Binary

Do not at the outset of your career make the all too common error of mistaking names for things. Names are only conventional signs for identifying things. Things are the reality that counts. If a thing is despised, either because of ignorance or because it is despicable, you will not alter matters by changing its name.

—W. E. B. Du Bois

A common misconception about words is that they have definitions: look up the definition, and that tells you everything to know about that word ... right?

It can't actually work that way—not in principle. The problem—one of them, anyway—is that with a sufficiently active imagination, you can imagine edge cases that satisfy the definition, but aren't what you really mean by the word.

What's a woman? An adult human female. (Let's not play dumb about this today.) Okay, but then what does female mean? One common and perfectly serviceable definition: of the sex that produces larger gametes—ova, eggs.

That's one common and perfectly serviceable definition in the paltry, commonplace real world—but not in the world of the imagination! We could imagine the existence of a creature that looks and acts exactly like an adult human male down to the finest details, except that its (his?) gonads produce eggs, not sperm! So one might argue that this would be a female and presumably a woman, according to our definitions, yes?

But if you saw this person on the street or even slept in their bed, you wouldn't want to call them a woman, because everything about them that you can observe looks like that of an adult human male. If you're not a reproductive health lab tech and don't look at the photographs in biology textbooks, you'll never see the gametes someone's body produces. (You can see semen, but the individual spermatozoa are too small to look at without a microscope; people didn't even know that ova and sperm existed until the 17th century.) Does that mean this common definition of female isn't perfectly serviceable after all?

No, because humans whose gonads produce eggs but appear male in every other aspect, are something I just made up out of thin air for the purposes of this blog post; they don't exist in the real world. What this really shows is that the cognitive technology of "words" having "definitions" doesn't work in the world of the imagination, because the world of the imagination encompasses (at a minimum) all possible configurations of matter. Words are short messages that compress information about the world, but what it means for the world to contain compressible information is that some things in the world are more probable than others.

To see why, let's take a brief math detour and review some elementary information theory. Instead of the messy real world, take a restricted setting: the world of strings of 20 bits. Suppose you wanted to devise an efficient code to represent elements of this world with shorter strings, such that you could say (for example) 01100 (in the efficient code, using just 5 bits) and the people listening to you would know that what you actually saw in the world was (for example) 01100001110110000010.

If every length-20 bitstring in the world has equal probability, this can't be done: there are 2²⁰ (= 1,048,576) length-20 strings and only 2⁵ (= 32) length-5 codewords; there aren't enough codewords to go around to cover all the strings in this world. It's worse than that: if every length-20 bitstring in the world has equal probability, you can't have labels that compress information at all: if you said that the first 19 bits of something you saw in the world were 0110000111011000001, the people listening to you would be completely clueless as to whether the whole thing was 01100001110110000010 or 01100001110110000011. Just locating a book in Jorge Luis Borges's Library of Babel is mathematically equivalent to writing it yourself.

However, in the world of a non-uniform probability distribution over strings of 20 bits, compression—and therefore language—is possible. Say, if almost all the bitstrings you actually saw in the world were either all-zeros (00000000000000000000) or all-ones (11111111111111111111), with a very few exceptions that were still mostly one bit or the other (like 00010001000000000000 or 11101111111011011111), then you could devise an efficient encoding.
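To put a rough number on how much the non-uniform world can be compressed, here is a minimal Python sketch. It assumes a toy model that the text above doesn't specify: each string starts as all-zeros or all-ones with equal probability, and then every bit is flipped independently with probability `flip_prob` (a hypothetical parameter). The Shannon entropy of that distribution is the average number of bits per string that an ideal code would need.

```python
from math import comb, log2


def entropy_bits(flip_prob: float, length: int = 20) -> float:
    """Shannon entropy (in bits) of a toy two-cluster distribution.

    Assumed model (hypothetical, for illustration only): pick all-zeros or
    all-ones with probability 1/2 each, then flip every bit independently
    with probability flip_prob.
    """
    total = 0.0
    for ones in range(length + 1):
        zeros = length - ones
        # Probability of one specific string that contains `ones` ones:
        # it came from all-zeros with `ones` flips, or all-ones with `zeros` flips.
        from_zeros = 0.5 * flip_prob**ones * (1 - flip_prob) ** zeros
        from_ones = 0.5 * flip_prob**zeros * (1 - flip_prob) ** ones
        p = from_zeros + from_ones
        if p > 0:
            # comb(length, ones) distinct strings share this probability.
            total -= comb(length, ones) * p * log2(p)
    return total


uniform = entropy_bits(0.5)     # every string equally likely
clustered = entropy_bits(0.01)  # almost always all-zeros or all-ones
```

Under these assumptions, the uniform case comes out at 20 bits per string (no compression is possible), while the clustered case needs fewer than 3 bits on average.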

To be efficient, you'd want to reserve the shortest words for the most common cases: like 00 in the code to mean 00000000000000000000 in the world and 01 to mean 11111111111111111111. Then you could have slightly-longer words that encode all the various exceptions, like maybe the merely-eleven-bit encoding 10110101110 could represent 00100010000000000000 in the world (1 to indicate that this is one of the exceptions, a following 0 to indicate that most of the bits are 0, followed by the Elias self-delimiting integer codes for 3 (110) and 7 (101110) to indicate that the 3rd and 7th bits are actually 1).
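The scheme just described can be sketched in Python. The codes 110 for 3 and 101110 for 7 match Elias omega coding, so that is what this sketch assumes. The function names are my own, and the decoding rule (keep reading position codes until the codeword runs out) is an assumption that the description above doesn't spell out.

```python
def elias_omega(n: int) -> str:
    """Elias omega code for a positive integer, as a string of '0'/'1'."""
    if n < 1:
        raise ValueError("Elias omega codes are defined for positive integers")
    code = "0"  # terminating zero
    while n > 1:
        binary = bin(n)[2:]
        code = binary + code   # prepend the binary form of n
        n = len(binary) - 1    # next, encode (number of bits - 1)
    return code


def read_elias_omega(bits: str, pos: int) -> tuple[int, int]:
    """Decode one omega code starting at bits[pos]; return (value, next_pos)."""
    n = 1
    while bits[pos] == "1":
        length = n + 1
        n = int(bits[pos:pos + length], 2)
        pos += length
    return n, pos + 1  # skip the terminating zero


def encode(world: str) -> str:
    """Map a bitstring from the 'world' to a short codeword."""
    if set(world) == {"0"}:
        return "00"
    if set(world) == {"1"}:
        return "01"
    majority = "0" if world.count("0") >= world.count("1") else "1"
    # 1-indexed positions of the bits that disagree with the majority.
    flipped = [i + 1 for i, bit in enumerate(world) if bit != majority]
    return "1" + majority + "".join(elias_omega(p) for p in flipped)


def decode(code: str, length: int = 20) -> str:
    """Recover the world bitstring from a codeword produced by encode()."""
    if code == "00":
        return "0" * length
    if code == "01":
        return "1" * length
    majority = code[1]
    minority = "1" if majority == "0" else "0"
    world = [majority] * length
    pos = 2
    while pos < len(code):
        position, pos = read_elias_omega(code, pos)
        world[position - 1] = minority
    return "".join(world)


example = encode("00100010000000000000")  # "10110101110", eleven bits
```

The common cases cost two bits each, and an exception costs two bits plus a handful more for each flipped position, which is still far fewer than 20 as long as flips are rare.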

Suppose that, even among the very few exceptions that aren't all-zeros or all-ones, the first bit is always in the majority and is never "flipped": you can have exceptions that "look like" 00000100000000000000 or 11011111111101111011, but never 10000000000000000000 or 01111111111111111111.

Then if you wanted an efficient encoding to talk about the two and only two clusters of bitstrings—the mostly-zeros (a majority of 00000000000000000000 plus a few exceptions with a few bits flipped) and the mostly-ones (a majority of 11111111111111111111 plus a few exceptions with a few bits flipped)—you might want to use the first bit as the "definition" for your codewords—even if most of the various probabilistic inferences that you wanted to make on the basis of cluster-membership concerned bits other than the first. The majoritarian first bit, even if you don't care about it in itself, is a simple membership test for the mostly-zeros/mostly-ones category system.
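As a sketch of why the first bit works as a membership test, the following Python samples strings under a made-up generative process (an assumption for illustration, not something specified above): pick a cluster, then flip at most three of bits 2 through 20, never bit 1. Classifying by the first bit alone then always agrees with classifying by a majority vote over all 20 bits.

```python
import random


def sample_bitstring(rng: random.Random, length: int = 20, max_flips: int = 3) -> str:
    """Draw one string from a hypothetical two-cluster world.

    The first bit is never flipped; up to max_flips of the other bits are.
    """
    base = rng.choice("01")
    other = "1" if base == "0" else "0"
    bits = [base] * length
    n_flips = rng.randint(0, max_flips)
    # Index 0 (the first bit) is excluded from the candidate positions.
    for i in rng.sample(range(1, length), n_flips):
        bits[i] = other
    return "".join(bits)


def cluster_by_first_bit(s: str) -> str:
    """The cheap 'definition': look only at the first bit."""
    return s[0]


def cluster_by_majority(s: str) -> str:
    """The 'thick' notion: which value do most of the bits take?"""
    return "1" if s.count("1") > len(s) / 2 else "0"


rng = random.Random(0)
samples = [sample_bitstring(rng) for _ in range(1000)]
agreement = all(cluster_by_first_bit(s) == cluster_by_majority(s) for s in samples)
```

Here `agreement` is true by construction: with at most three flips out of 20, the majority can never change, so the one discrete bit stands in for the whole cluster.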

Unfortunately—deeply unfortunately—this is not a math blog. I wish this were a math blog—that I lived in a world where I could afford to do math blogging for the glory of our collective understanding of greater reality. ("Gender?" I would say, confused if not slightly disgusted, "I barely know her.") It would be a better way to live than being condemned to gender blogging in self-defense, hopelessly outgunned, outmanned, outnumbered, outplanned in a Total Culture War over the future of my neurotype-demographic. But since I do, somehow, go on living like this—having briefly explained the theory, let's get back to the dreary, how do you say?—application.

Defining sex in terms of gamete size or genitals or chromosomes is like using the never-flipped first bit in our abstract example about the world of length-20 bitstrings. It's not that people directly care about gametes or chromosomes or even genitals in most everyday situations. (You're not trying to mate with most of the people you meet in everyday situations, and sex chromosomes weren't discovered until the 20th century.) It's that these are discrete features that are causally entangled with everything else that differs between females and males—including many correlated statistical differences of various effect sizes, and differences that are harder to articulate or measure, and differences that haven't even been discovered yet (as gametes and chromosomes hadn't respectively been discovered yet in the 16th and 19th centuries) but can be theorized to exist because sex is a very robust abstraction that you need in order to understand the design of biological creatures.

Discrete features make for better word definitions than high-dimensional statistical regularities, even if most of the everyday inferential utility of using the word comes from the high-dimensional statistical correlates. A dictionary definition is just a helpful pointer to help people pick out "the same" natural abstraction in their own world-model.

(Gamete size is a particularly good definition for the natural category of sex because the concept of anisogamy generalizes across species that have different sex determination systems and sexual anatomy. In birds, the presence or absence of a W chromosome determines whether an animal is female, in contrast to the Y chromosome's determination of maleness in mammals, and some reptiles' sex is determined by the temperature of a laid egg while it develops. And let's not get started on the cloaca.)

But because our brains are good at using sex-category words to simultaneously encode predictions about both absolute discrete differences and high-dimensional statistical regularities of various effect sizes, without our being consciously aware of the cognitive work being done, it's easy to get confused by verbal gymnastics if you don't know the theory.

I sometimes regret that so many of my attempts to talk about trans issues end up focusing on psychological sex differences. I guess I'm used to it now, but at first, this was a weird position for me to be in! (For a long time, I really didn't want to believe in psychological sex differences.) But it keeps happening because it's a natural thing to disagree about: the anatomy of pre-op trans women is not really in dispute, so the sex realist's contextual reply to "Why do you care what genitals someone might or might not have under their clothes?" often ends up appealing to some psychological dimension or another, to which the trans advocate can counterreply, "Oh, you want to define gender based on psychology, then? But then the logic of your position forces you to conclude that butch lesbians aren't women! Reductio ad absurdum!"

This is a severe misreading of the sex-realist position. No one wants to define "gender" based on psychology. Mostly, definitions aren't the kind of thing you should have preferences about: you can't coerce reality into changing by choosing different definitions! Rather, there's already a multivariate distribution of bodies and minds in the world, and good definition choices help us coordinate the concepts in different people's heads into a shared map of that territory.

One of the many distinctions people sometimes want to make when thinking about the multivariate distribution of bodies and minds in the world, is that between the sexes. But sex is by no means the only way in which people differ! In many situations, you might want to categorize or describe people in many different ways, some more or less discrete versus continuous, or high- versus low-dimensional: age or race or religion or subculture or social class or intelligence or agreeableness.

It's possible that the categories that are salient in a particular culture ought to be revised in order to fit the world better: maybe we should talk about categories like "masculine people" (including both typical men, and butch lesbians) more often! But the typical trans advocate shell game of just replacing "sex" with "gender" and letting people choose their "gender" isn't going to fly, because sex actually exists and we have a need for language to talk about it—or maybe, the fact that we have a need for language to talk about it (the fact that the information we observe admits compression) is what it means for sex to "actually" "exist".

One of the standard gender-critical complaints about trans ideology is that it's sexist on account of basing its categories on regressive sex stereotypes. On the categories-as-compression view, we can see that this complaint has something to it: if you remove the discrete, hard-line differences like genitals and chromosomes from your definitions of female and male, there's nothing left for the words to attach to but mere statistical tendencies—that is, stereotypes.

Conversely, another classic gender-critical trope is that sex is just about genitals and chromosomes and gamete size. Any "thicker" concept of what it means to be a woman or man is sexist nonsense. With some trepidation, I also don't think that one's going to fly. It's hard to see why most gender-critical feminists would care so much about maintaining single-sex spaces, if sex were strictly a matter of genitals or (especially) chromosomes or gamete size; it would seem that they too want mere statistical tendencies to be part of the concept.

This is somewhat ideologically inconvenient for antisexists like I used to be, insofar as it entails biting the bullet on masculine women and feminine men being in some sense less "real" women and men, respectively. Are our very concepts not then reinforcing an oppressive caste system?

I don't think the situation is quite that bad, as long as the map–territory relationship stays mostly one-directional: the map describing the territory, rather than the territory being bulldozed to suit the map—outliers needing a slightly longer message length to describe, rather than being shot. In my antisexist youth, I don't think I would have wanted to concede even that much, but I couldn't then have explained how that would work mathematically—and I still can't. Let me know if you figure it out.


Fake Deeply

"I want you, Jake," said the woman in the video as she took off her shirt. "Those negative comments on your pull requests were just a smokescreen—because I was afraid to confront the inevitability of our love!"

Jake Morgan still couldn't help but marvel at what he and his team had built. It really looked and sounded just like her!

It had been obvious since DALL-E back in 'twenty-one—earlier if you were paying attention—that generative AI would reach this level of customization and realism before too long. Eventually, it was just a matter of the right couple dozen people rolling up their sleeves—and Magma's willingness to pony up the compute—to make it work. But it worked. His awe at Multigen's sheer power would have been humbling, if not for the awareness of his own role in bringing it into being.

Of course, this particular video wouldn't be showcased in the team's next publication. Technically, Magma employees were not supposed to use their cutting-edge generative AI system to make custom pornography of their coworkers. Technically (what was probably a lesser offense) Magma employees were not supposed to be viewing pornography during work hours. Technically—what should have been a greater offense—Magma employees were not supposed to covertly introduce a bug into the generative AI service codebase specifically in order to make it possible to create custom pornography of their coworkers without leaving a log.


Illustration made with Stable Diffusion XL 1.0

But, technically? No one could enforce any of that. Developers needed to test what the system they were building was capable of. The flexibility for employees to be able to take care of the occasional personal task during the day was universally understood (if not always explicitly acknowledged) as a perk of remote-work policies. And everyone writes bugs.

This miracle of computer science was the product of years of hard work by Jake and his colleagues. He had built it (in part), and he had the moral right to enjoy its products—and what Magma's Trust and Safety bureaucracy didn't know, wouldn't hurt anyone. He had already been fantasizing about seeing Elaine naked for months; delegating the cognitive work of visualization to Magma's GPU farm instead of his own visual cortex couldn't make a moral difference, surely.

Elaine, probably, would object, if she knew. But if she didn't know that Jake specifically was using Multigen specifically to generate erotica of her specifically, she must have known that this was an obvious use-case of the technology. If she didn't want people using generative AI to visualize her body in sexually suggestive situations, then why was she working to advance the state of generative AI? Really, she had no one to blame but herself.

Just as he was about to come, he was interrupted by an instant messenger notification. It was from someone named Chloë Lemoine, saying she'd like to discuss an issue in the Multigen codebase at his earliest convenience.

Trans or real? Jake wondered, clicking on her profile.

The profile text indicated that Chloë was on the newly formed capability risk evaluations team. Jake groaned. Yuddites. Fears of artificial intelligence destroying humanity had been trending recently. In response, Magma had commissioned a team to monitor and audit the company's AI projects for the emergence of unforeseen and potentially dangerous capabilities, although the exact scope of the new team's power was unclear and probably subject to the outcome of future intra-company political battles.

Jake took a dim view of the AI risk crowd. Given what deep learning could do nowadays, it didn't feel quite right to dismiss their doomsday stories as science fiction, exactly, but Jake maintained that it was the wrong subgenre of science fiction. His team was building the computer from Star Trek, not AM or the Blight: tools, not creatures. Despite the brain-inspired name, "neural networks" were ultimately just a technique for fitting a curve to training data. If it was counterintuitive how much you could get done with a curve fitted to the entire internet, previous generations of computing pioneers must have found it equally counterintuitive how much you could get done with millions of arithmetic operations per second. It was a new era of technology, not a new era of life.

It was because of his skepticism rather than in spite of it that he had volunteered to be the Multigen team's designated contact person for the risk evals team (which was no doubt why this Chloë person had messaged him). No one else had volunteered at the meeting when it came up, and Jake had been slightly curious what "capability risk evaluations" would even entail.

Well, now he would find out. He washed his hands and messaged Chloë back, offering to hop on a quick video call.

Definitely trans, thought Jake, as Chloë's face appeared on screen.

"I hope I'm not interrupting anything important," she said.

"No, nothing important," he said smoothly. "What was it you wanted to discuss?"

"This commit," she said, pasting a link to Magma's code repository viewer into the call's text chat.

Jake's blood ran cold. The commit message at the link described the associated code change as modifying a regex—a regular expression, a sequence of characters specifying a pattern to search for in text. This one was used for logging requests to the Multigen service; the revised regex would now extract the client's user-agent string as a new metadata field.

That much was true. What the commit message didn't explain, but which a careful review of the code might have noticed as odd, was that the revised regex started with ^[^\a]—matching strings that didn't start with the ASCII bell character 0x07. The bell character was a historical artifact from the early days of computing. No sane request would start with a bell, and so the odd start to the regex would do no harm ... unless, perhaps, some client were to start their request with a bell character, in which case the regex would fail to match and the request would silently fail to be logged.
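For readers who don't read regexes that carefully either, here is a minimal Python sketch of the failure mode. Only the leading `^[^\a]` comes from the story; the rest of the pattern, the `user-agent=` request format, and the function name are hypothetical stand-ins for illustration.

```python
import re

# Hypothetical logging pattern: only the leading ^[^\a] is taken from the story.
# [^\a] matches any single character except the ASCII bell (0x07).
LOG_PATTERN = re.compile(r"^[^\a].*?user-agent=(?P<user_agent>\S+)", re.DOTALL)


def extract_log_fields(request_text):
    """Return the metadata to log, or None if the request fails to match."""
    match = LOG_PATTERN.search(request_text)
    if match is None:
        return None  # nothing is logged for this request
    return {"user_agent": match.group("user_agent")}


ordinary = "prompt=puppies user-agent=curl/8.0"
belled = "\a" + ordinary  # the same request, prefixed with the bell character

print(extract_log_fields(ordinary))
print(extract_log_fields(belled))
```

The ordinary request yields a metadata dictionary, while the bell-prefixed one yields `None`, because `^` anchors the match to the start of the string and the first character is the one thing `[^\a]` refuses to match.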

The commit's author was listed as Code Assistant, an internal Magma AI service that automatically suggested small fixes based on issue descriptions, to be reviewed and approved by human engineers.

That part was mostly true. Code Assistant had created the logging change. Jake had written the bell character backdoor and melded it onto Code Assistant's change, gambling that whichever of his coworkers got around to reviewing Code Assistant's most recent change submissions would rubber-stamp them without noticing the bug. (Who reads regexes that carefully, really?) If they did notice, they would blame Code Assistant. (Language models hallucinate weird things sometimes. Who knows what it was "thinking"?)

Thus, by carefully prefixing his requests with the bell character, Jake could make all the custom videos he wanted, with no need to worry about explaining himself if someone happened to read the logs. It was the perfect crime—not a crime, really. A precaution.

But now his precaution had been discovered. So much for his career at Magma. But only at Magma—the industry gossip network wouldn't prevent him from landing on his feet elsewhere ... right?

Chloë was explaining the bug. "... and so, if a client were to send a request starting with the ASCII bell character—I know, right?—then the request wouldn't be logged."

"I see," said Jake, his blood thawing. Chloë's tone wasn't accusatory. If she ("she") wasn't here to tell him his career was over, he'd better not let anything on. "Well, thanks for telling me. I'll fix that right after this call." He forced a chuckle. "Language models hallucinate weird things sometimes. Who knows what it was 'thinking'?"

"Exactly!" said Chloë. "Who knows what it was thinking? That's what I wanted to talk to you about!"

"Uh ..." Jake balked. If he hadn't been found out, why was someone from risk evals talking to him about a faulty regex? The smart play would be to disengage as quickly as possible, rather than encourage inquiry about the cause of the bug, but he was intrigued by the possibility that Chloë was implying what he thought she was. "You're not suggesting Code Assistant might have written this bug on purpose?"

She smirked. "And if I am?"

"That's absurd. It's not a person that wants things. It's an autoregressive language model fine-tuned to map ticket descriptions to code changes."

"And humans are just animals evolved to maximize inclusive genetic fitness. If natural selection could hill-climb its way into creating general intelligence, why can't stochastic gradient descent? I don't think it's dignified for humanity to be playing with AI at all given our current level of wisdom, but if it's happening anyway, thanks to the efforts of people like you"—okay, now her tone was accusatory—"it's my heroic responsibility to maintain constant vigilance. To monitor the things we're creating and be ready to sound the fire alarm, if there's anyone sane left to hear it."

Jake shook his head. These Yuddites were even nuttier than he thought. "And your evidence for this is, what? That the model wrote a silly regex once?"

"And that the bug is being exploited."

Jake's blood flash-froze. "Wh—what?"

Chloë pasted two more links into the chat, this time to Magma's log viewer. "Requests go through a reverse proxy before hitting the Multigen service itself. Comparing the two, there are dozens of requests logged by the reverse proxy that don't show up in Multigen's logs—starting just minutes after the bug was deployed. The reverse proxy logs include the client IP, which is inside Magma's VPN, of course"—Multigen wasn't yet a public-facing product—"but don't include the request data or user auth, so I don't know what the client was doing specifically. Which is apparently just what they, or it, wanted."
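The comparison Chloë describes amounts to a set difference between two logs. A rough sketch in Python, assuming (this is not stated in the story) that both logs share some request identifier to join on; the field names and the sample entries below are invented for illustration.

```python
def unlogged_requests(proxy_entries, service_entries):
    """Return proxy entries whose request ID never appears in the service log."""
    logged_ids = {entry["request_id"] for entry in service_entries}
    return [e for e in proxy_entries if e["request_id"] not in logged_ids]


# Made-up sample data: the proxy sees every request, the service log misses some.
proxy_log = [
    {"request_id": "r1", "client_ip": "10.0.0.5"},
    {"request_id": "r2", "client_ip": "10.0.0.7"},
    {"request_id": "r3", "client_ip": "10.0.0.7"},
]
service_log = [
    {"request_id": "r1"},
]

missing = unlogged_requests(proxy_log, service_log)
print([entry["request_id"] for entry in missing])
```

Anything that reaches the proxy but never reaches the service log shows up in `missing`, which is all an investigator needs in order to know where to look next.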

Jake silently and glumly reviewed the logs. The timestamps were consistent with when he had been requesting videos. He remembered that after one of his coworkers (Elaine, as it turned out) had approved the doctored Code Assistant change, he had eagerly waited for the build automation to deploy the faulty commit so that he could try it out as soon as possible.

How did you even find this? he wanted to ask, but that didn't seem like a smart play. Finally, he said, "You really think Code Assistant did this? 'Deliberately' checked in a bug, and then exploited it to secretly request some image or video generations? For some 'reason of its own'?"

"I don't know anything—yet—but look at the facts," said Chloë. "The bug was written by Code Assistant. Immediately after it gets merged and deployed, someone apparently starts exploiting it. How do you think I should explain this?"

For a moment, Jake thought she must be blackmailing him—that she knew his guilt, and the question was her way of subtly offering to play dumb in exchange for his office-political support for anything risk evals might want in the future.

That didn't fit, though. Anyone who could recite Yuddite cant with such conviction (not to mention the whole pretending-to-be-a-woman thing) clearly had the true-believer phenotype. This Chloë meant exactly what she said.

How did he think she should explain this? There was a perfectly ordinary explanation that had nothing to do with Chloë's wrong-kind-of-science-fiction paranoia—and Jake's career depended on her not figuring it out.

"I don't know," he said. He suddenly remembered that staying in this conversation was probably not in his interests. "You know, I actually have another meeting to get to," he lied. "I'll fix that regex today. I don't suppose you need anything else from me—"

"Actually, I'd like to know more about Multigen—and I'll likely have more questions after I talk to the Code Assistant team. Can I pick a time on your calendar next week?" It was Friday.

"Sure. Talk to you then—if we humans are still alive, right?" Jake said, hoping to add a touch of humor, and only realizing in the moment after he said it what a terrible play it was; Chloë was more likely to take it as mockery than find it endearing.

"I hope so," she said solemnly, and hung up.

Shit! How could he have been so foolish? It had been a specialist's blindness. He worked on Multigen. He knew that Multigen logged requests, and that people on his team occasionally had reason to search those logs. He didn't want anyone knowing what he was asking Multigen to do. So he had arranged for his requests to not appear in Multigen's logs, thinking that that was enough.

Of course it wasn't enough! He hadn't considered that Multigen would sit behind a different server (the reverse proxy) with its own logs. He was a research engineer, not a devops guy; he wrote code, but thinking about how and where the code would actually run had always been someone else's job.

It got worse. When the Multigen web interface supplied the user's requested media, that data had to live somewhere. The videos themselves would still be on Magma's object storage cluster! How could that have seemed like an acceptable risk? Jake struggled to recall what he had been thinking at the time. Had he been too horny to even consider it?

No. It had seemed safe at the time because videos weren't searchable. Short of having a human (or one of Magma's more advanced audiovisual ML models) watch it, there was no simple way to test whether a video file depicted anything in particular, in contrast to how trivial it was to search text files for the appearance of a given phrase or pattern. The videos would be saved in object storage under uninformative file names based on the timestamp and a unique random identifier. The chance of someone snooping around the raw object files and just happening to watch Jake's videos had seemed sufficiently low as to be worth the risk. (Although the chance of someone catching a discrepancy between Multigen's logs and some other unanticipated log would have seemed low before it actually just happened, which cast doubt on his risk assessment skills.)

But now that Chloë was investigating the bell character bug, it was only a matter of time. Comparing a directory listing of the object storage cluster with the timestamps of the missing logs would reveal which files had been generated illicitly.

He had some time. Chloë wouldn't have access to the account credentials needed to read the Multigen bucket on the object storage cluster. In fact, it was likely that she'd ask Jake himself for help with that next week. (He was the Multigen team's designated contact to risk evals, and Chloë, the true believer in malevolent robots, showed no signs of suspecting him. There would be no reason to go behind his back.)

However, Jake did have access to the cluster. He almost laughed in relief. It was obvious what he needed to do. Grab the object storage credentials from Multigen's configuration, get a directory listing of files in the bucket, compare to the missing logs to figure out which files were incriminating, and overwrite the incriminating files with innocuous Multigen-generated videos of puppies or something.

He had only made a couple dozen videos, but the work of covering it up would be the same if he had made thousands; it was a simple scripting job. Code Assistant probably could have done it.

Chloë would be left with the unsolvable mystery of what her digital poltergeist wanted with puppy videos, but Jake was fine with that. (Better than trying to convince her that the rogue AI wanted nudes of female Magma employees.) When she came back to him next week, he would just need to play it cool and answer her questions about the system.

Or maybe—he could read some Yuddite literature over the weekend, feign a sincere interest in 'AI safety', try to get on her good side? Jake had trouble believing that any sane person could really think that Magma's machine learning models were plotting something. This cult victim had ridden a wave of popular hysteria into a sinecure. If he played nice and validated her belief system in the most general terms, maybe that would be enough to make her feel useful and therefore not need to bother chasing shadows in order to justify her position. She would lose interest and this farcical little investigation would blow over.


"And so just because an AI seems to be behaving well, doesn't mean it's aligned," Chloë was explaining. "If we train AI with human feedback ratings, we're not just selecting for policies that do tasks the way we intended. We're also selecting for policies that trick human evaluators into giving high ratings. In the limit, you'd expect those to dominate. 'Be good' strategies can't compete with 'look good' strategies in a looking-good contest—but in the current paradigm, looking good is the only game in town. We don't know how these systems work in the way that we know how ordinary software works; we only know how to train them."

"So then we're just screwed, right?" said Jake in the tone of an attentive student. They were in a conference room on the Magma campus on Monday. After fixing the logging regex and overwriting the evidence with puppies, he had spent the weekend catching up with the 'AI safety' literature. Honestly, some of it had been better than he expected. Just because Chloë was nuts didn't mean there was nothing intelligent to be said about risks from future systems.

"I mean, probably," said Chloë. She was beaming. Jake's plan to distract her from the investigation by asking her to bring him up to speed on AI safety seemed to be working perfectly.


Illustration by Stable Diffusion XL 1.0

"But not necessarily," she continued. "There are a few avenues of hope—at least in the not-wildly-superhuman regime. One of them has to do with the fragility of deception.

"The thing about deception is, you can't just lie about one thing. Everything is connected to each other in the Great Web of Causality. If you lie about one thing, you also have to lie about the evidence pointing to that thing, and the evidence pointing to that evidence, and so on, recursively covering up the coverups. For example ..." she trailed off. "Sorry, I didn't rehearse this; maybe you can think of an example."

Jake's heart stopped. She had to be toying with him, right? Indeed, Jake could think of an example. By his count, he was now three layers deep into his stack of coverups and coverups-of-coverups (by writing the bell character bug, attributing it to Code Assistant, and overwriting the incriminating videos with puppies). Four, if you counted pretending to give a shit about 'AI safety'. But now he was done ... right?

No! Not quite, he realized. He had overwritten the videos, but the object metadata would still show them with a last-modified timestamp of Friday evening (when he had gotten his puppy-overwriting script working), not the timestamp of their actual creation (which Chloë had from the reverse-proxy logs). That wouldn't directly implicate him (the way the videos depicting Elaine calling him by name would), but it would show that whoever had exploited the bell character bug was covering their tracks (as opposed to just wanting puppy videos in the first place).

But the object storage API probably provided a way to edit the metadata and update the last-modified time, right? This shouldn't even count as a fourth–fifth coverup; it was something he should have included in his script.

Was there anything else he was missing? The object storage cluster did have an optional "versioning" feature. When activated for a particular bucket, it would save previous versions of an object rather than overwriting them. He had assumed versioning wasn't on for the bucket that Multigen was using. (It wouldn't make sense; the workflow didn't call for writing the same object name more than once.)

"I think I get the idea," said Jake in his attentive student role. "I'm not seeing how that helps us. Maybe you could explain." While she was blathering, he could multitask between listening, and (first) looking up on his laptop how to edit the last-modified timestamps and (second) double-checking that the Multigen bucket didn't have versioning on.

"Right, so if a model is behaving well according to all of our deepest and most careful evaluations, that could mean it's doing what we want, but it could be elaborately deceiving us," said Chloë. "Both policies would be highly rated. But the training process has to discover these policies continuously, one gradient update at a time. If the spectrum between the 'honest' policy and a successfully deceptive policy consists of less-highly rated policies, maybe gradient descent will stay—or could be made to stay—in the valley of honest policies, and not climb over the hill into the valley of deceptive policies, even though those would ultimately achieve a lower loss."
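The two-valleys picture can be made concrete with a toy example. The one-dimensional loss below is entirely an assumption chosen for illustration (nothing like a real training objective): it has a shallower "honest" valley near x = -1, a deeper "deceptive" valley near x = +1, and a hill between them. Plain gradient descent started in the honest valley settles there and never crosses the hill, even though the other valley has lower loss.

```python
def loss(x):
    """Toy double-well loss: two valleys separated by a hill near x = 0."""
    return (x**2 - 1) ** 2 - 0.2 * x


def grad(x):
    """Derivative of the toy loss above."""
    return 4 * x**3 - 4 * x - 0.2


def gradient_descent(x, learning_rate=0.05, steps=200):
    """Take small local steps downhill from the starting point x."""
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


honest = gradient_descent(-1.2)    # starts in the left ("honest") valley
deceptive = gradient_descent(1.2)  # starts in the right ("deceptive") valley

print(round(honest, 3), round(loss(honest), 3))
print(round(deceptive, 3), round(loss(deceptive), 3))
```

The run started on the left ends up with the higher loss of the two, which is the whole point: a local search only finds the deeper valley if something carries it over the hill.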

"Uh huh," Jake said unhappily. The object storage docs made clear that the Last-Modified header was set automatically by the system; there was no provision for users to set it arbitrarily.

"Here's an example due to Paul," Chloë said. Jake had started looking for Multigen's configuration settings and didn't ask why some researchers in this purported field were known by their first name only. "Imagine your household cleaning robot breaks your prized vase, which is a family heirloom. That's a negative reward. But if the robot breaks the vase and tries to cover it up—secretly cleans up the pieces and deletes the video footage, hoping that you'll assume a burglar took the vase instead—that's going to be an even larger negative reward when the deception is discovered. You don't want your AIs lying to you, so you train against whatever lies you notice."

"Uh huh," Jake said, more unhappily. It turned out that versioning was on for the bucket. (Why? But probably whoever's job it was to set up the bucket had instead asked, Why not?) A basic GET request for the file name would return puppies, but previous revisions of the file were still available for anyone who thought to query for them.
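A toy model of why overwriting doesn't help when versioning is on. This in-memory class is a hypothetical sketch, not the API of any real object store; the class, its method names, and the object key are all made up for illustration.

```python
class VersionedBucket:
    """Toy bucket that keeps every version ever written to a name."""

    def __init__(self):
        self._versions = {}  # object name -> list of payloads, oldest first

    def put(self, name, data):
        # An "overwrite" appends a new version instead of replacing the old one.
        self._versions.setdefault(name, []).append(data)

    def get(self, name):
        # A basic read returns only the most recent version.
        return self._versions[name][-1]

    def list_versions(self, name):
        # Earlier versions remain available to anyone who asks for them.
        return list(self._versions[name])


bucket = VersionedBucket()
key = "video-0001"  # hypothetical object name
bucket.put(key, b"original video")
bucket.put(key, b"puppies")

print(bucket.get(key))
print(bucket.list_versions(key))
```

The basic read returns puppies, but the version listing still holds the original payload, so the overwrite hides the evidence only from someone who never thinks to query for prior revisions.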

"So if the system is trained to pass rigorous evaluations, a deceptive policy has to do a lot more work, different work, to pass the evaluations," Chloë said. "Maybe it buys a new, similar-looking vase to put in the same place, and forges the payment memo to make it look like the purchase was for cleaning supplies, and so on. The point is, small, 'shallow' deceptions aren't stable. The set of policies that do well on evaluations comes in two disconnected parts: the part that tells the truth, and the part that—not just lies, but, um—"

Jake's attentive student persona finished the thought. "But spins up an entire false reality, as intricate as it needs to be to protect itself. If you're going to fake anything, you need to fake deeply."

"Exactly, you get it!" Chloë was elated. "You know, when I called you last week, I was worried you thought I was nuts. But you see the value of constant vigilance now, right?—why we need to investigate and debug things like this until we understand what's going on, instead of just shrugging that neural nets do weird things sometimes. If the landscape of policies looks something like what I've described, catching the precursors to deception early could be critical—to raise the height of the loss landscape between the honest and deceptive policies, before frontier AI development plunges into the latter. To get good enough at catching lies, for honesty to be the best policy."

"Yeah," Jake said. "I get it."

"Anyway, now that you understand the broader context, I had some questions about Multigen," said Chloë. "How is the generated media stored? I'm hoping it's still possible to see what was generated in the requests that escaped logging."

There it was. Time to stall, if he could. "Um ... I think it goes into the object storage cluster, but I'm not really familiar with that part of the codebase," Jake lied. "Could we set up another meeting after I've looked at the code?"

She smiled. "Sure."

Purging the videos for real wasn't obviously possible given the level of access he currently had, but he had just bought a little bit of time. Could he convince whoever's job it was to turn off versioning for the Multigen bucket, without arousing suspicion? Probably? There had to be other avenues to stall or misdirect Chloë. He'd think of something ...

But it was a little unnerving that he kept having to think of something. Was there something to Chloë's galaxy-brained philosophical ramblings? Was there some sense in which it really was one or the other—a policy of truth, or a policy of lies?

He wasn't scared of the robot apocalypse any time soon. But who could say but that someday—many, many years from now—a machine from a different subgenre of science fiction would weigh decisions not unlike the ones he faced now? He stifled a nervous laugh, which was almost a sob.

"Are you okay?" Chloë asked.

"Well—"


Proceduralist Sticker Graffiti

Here's how I know that I moved to the correct side of the Caldecott Tunnel.

Just outside my door (which has one lock, three fewer than the old apartment—probably some kind of prior being expressed there), the street lamp post has a "CENSORSHIP IS ANTI-SCIENCE" sticker on it.

A fire hydrant across the street has an "ELECTION INTEGRITY IS A BIPARTISAN ISSUE" sticker.

What's extraordinary about these slogans is how meta they are: advocating for processes that lead to good results, rather than a position to be adopted by such a process. The anti-censorship sticker isn't protesting that some particular message is being suppressed by the powers that be, but rather that suppressing speech is itself contrary to the scientific method, which selects winning ideas by empiricism rather than by force. The election integrity sticker evinces a commitment to the democratic process, implying that voter fraud and voter suppression both undermine the execution of a free and fair election that represents the popular will, whose outcome is legitimate because the process is legitimate.

I should wish to live in a Society where such thoughts are too commonplace to be worth a sticker, rather than so rare that seeing them expressed in stickers should provoke an entire blog post. As things are, I was happy to see the stickers and felt that they were somehow less out-of-place here than they would have been in Berkeley, fifteen miles west in geographical space and a couple years further in political time.

Who put these stickers here? I wish I could meet them, and find out if I'm projecting too much of my own philosophy onto these simple slogans. What would they say, if prompted to describe their politics and given more than six words of bandwidth to reply? Would their bravery have been deterred (as mine probably would) had their target already been defaced by a decal bearing a different tagline, "STICKER GRAFFITI VIOLATES PROPERTY RIGHTS"?

Addendum, 15 December 2023: I missed these ("FREE SPEECH" inside of a heart), at the foot of the lamp post—


Start Over

Can we all start over
After the final chapter's end?
When it all starts over
How do these scars begin to mend?

Centaurworld

I moved apartments the other week, on some philosopher's birthday or the anniversary of a national tragedy, to a nice studio in a nice neighborhood back on the correct side of the Caldecott Tunnel (now that I've learned my lesson about which side of the tunnel is correct).

I had been making noises about leaving Berkeley for a while, but kept not getting around to it until my hand was forced by my roommate moving out. Insofar as I was complaining about the political culture, you might think that I should have fled the Bay entirely, to a different region which might have different kinds of people. Reno, probably. Or Austin (which may be the Berkeley of Texas, but at least it's the Berkeley of Texas).

I don't think a longer move was necessary. I mostly live on the internet, anyway: insofar as "Berkeley" was a metonym for the egregore, it's unclear how much leaving the literal city would help.

Although—it may not be entirely a coincidence that I feel better having left the literal city? Moving is a Schelling point for new beginnings, new habits. The sense that my life is over hasn't fully gone away, but now I have more hope in finding meaning and not just pleasure in this afterlife while it lasts, perhaps fueled by regret-based superpowers.

I'm happy. I have a lot of writing to do.


In my old neighborhood in the part of Berkeley that's secretly Oakland (the city limits forming a peninsula just around my apartment), there used to be a "free store" on the corner—shelves for people to leave unwanted consumer goods and to take them to a good home. It stopped being a thing shortly before I left, due to some combination of adverse attention from city municipal code inspectors, and a fire.

In memoriam, there was a butcher-paper sign on the fence with a pen on a string, asking community members to write a note on what the free store had meant to them.

One of the messages read:

i'm a (very broke) trans woman
and i don't often feel great about
my body, but there are a few items
that i found here that fit me in a way
thats very affirming to me

There are so many questions (of the rhetorical or probing varieties) I could ask of my neighbor who wrote that message. (Why mention being trans at all? Don't many cis women often not feel great about their bodies? What do you think are the differences between you and a man who might have written a message starting "I'm a (very broke) transvestite"? Or is it just that such a man's sense of public decency would bid him keep quiet ... or, just possibly, write a message more like yours?)

But—not my neighbor.

I don't live there anymore.


A Hill of Validity in Defense of Meaning

If you are silent about your pain, they'll kill you and say you enjoyed it.

—Zora Neale Hurston

Recapping my Whole Dumb Story so far—in a previous post, "Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems", I told the part about how I've "always" (since puberty) had this obsessive sexual fantasy about being magically transformed into a woman and also thought it was immoral to believe in psychological sex differences, until I got set straight by these really great Sequences of blog posts by Eliezer Yudkowsky, which taught me (incidentally, among many other things) how absurdly unrealistic my obsessive sexual fantasy was given merely human-level technology, and that it's actually immoral not to believe in psychological sex differences given that psychological sex differences are actually real. In a subsequent post, "Blanchard's Dangerous Idea and the Plight of the Lucid Crossdreamer", I told the part about how, in 2016, everyone in my systematically-correct-reasoning community up to and including Eliezer Yudkowsky suddenly started claiming that guys like me might actually be women in some unspecified metaphysical sense and insisted on playing dumb when confronted with alternative explanations of the relevant phenomena, until I eventually had a sleep-deprivation- and stress-induced delusional nervous breakdown.

That's not the egregious part of the story. Psychology is a complicated empirical science: no matter how obvious I might think something is, I have to admit that I could be wrong—not just as an obligatory profession of humility, but actually wrong in the real world. If my fellow rationalists merely weren't sold on the thesis about autogynephilia as a cause of transsexuality, I would be disappointed, but it wouldn't be grounds to denounce the entire community as a failure or a fraud. And indeed, I did end up moderating my views compared to the extent to which my thinking in 2016–7 took the views of Ray Blanchard, J. Michael Bailey, and Anne Lawrence as received truth. (At the same time, I don't particularly regret saying what I said in 2016–7, because Blanchard–Bailey–Lawrence is still obviously directionally correct compared to the nonsense everyone else was telling me.)

But a striking pattern in my attempts to argue with people about the two-type taxonomy in late 2016 and early 2017 was the tendency for the conversation to get derailed on some variation of, "Well, the word woman doesn't necessarily mean that," often with a link to "The Categories Were Made for Man, Not Man for the Categories", a November 2014 post by Scott Alexander arguing that because categories exist in our model of the world rather than the world itself, there's nothing wrong with simply defining trans people as their preferred gender to alleviate their dysphoria.

After Yudkowsky had stepped away from full-time writing, Alexander had emerged as our subculture's preeminent writer. Most people in an intellectual scene "are writers" in some sense, but Alexander was the one "everyone" reads: you could often reference a Slate Star Codex post in conversation and expect people to be familiar with the idea, either from having read it, or by osmosis. The frequency with which "... Not Man for the Categories" was cited at me seemed to suggest it had become our subculture's party line on trans issues.

But the post is wrong in obvious ways. To be clear, it's true that categories exist in our model of the world, rather than the world itself—categories are "map", not "territory"—and it's possible that trans women might be women with respect to some genuinely useful definition of the word "woman." However, Alexander goes much further, claiming that we can redefine gender categories to make trans people feel better:

I ought to accept an unexpected man or two deep inside the conceptual boundaries of what would normally be considered female if it'll save someone's life. There's no rule of rationality saying that I shouldn't, and there are plenty of rules of human decency saying that I should.

This is wrong because categories exist in our model of the world in order to capture empirical regularities in the world itself: the map is supposed to reflect the territory, and there are "rules of rationality" governing what kinds of word and category usages correspond to correct probabilistic inferences. Yudkowsky had written a whole Sequence about this, "A Human's Guide to Words". Alexander cites a post from that Sequence in support of the (true) point about how categories are "in the map" ... but if you actually read the Sequence, another point that Yudkowsky pounds home over and over, is that word and category definitions are nevertheless not arbitrary: you can't define a word any way you want, because there are at least 37 ways that words can be wrong—principles that make some definitions perform better than others as "cognitive technology."

In the case of Alexander's bogus argument about gender categories, the relevant principle (#30 on the list of 37) is that if you group things together in your map that aren't actually similar in the territory, you're going to make bad inferences.

Crucially, this is a general point about how language itself works that has nothing to do with gender. No matter what you believe about controversial empirical questions, intellectually honest people should be able to agree that "I ought to accept an unexpected [X] or two deep inside the conceptual boundaries of what would normally be considered [Y] if [positive consequence]" is not the correct philosophy of language, independently of the particular values of X and Y.

This wasn't even what I was trying to talk to people about. I thought I was trying to talk about autogynephilia as an empirical theory of psychology of late-onset gender dysphoria in males, the truth or falsity of which cannot be altered by changing the meanings of words. But at this point, I still trusted people in my robot cult to be basically intellectually honest, rather than slaves to their political incentives, so I endeavored to respond to the category-boundary argument under the assumption that it was an intellectually serious argument that someone could honestly be confused about.

When I took a year off from dayjobbing from March 2017 to March 2018 to have more time to study and work on this blog, the capstone of my sabbatical was an exhaustive response to Alexander, "The Categories Were Made for Man to Make Predictions" (which Alexander graciously included in his next links post). A few months later, I followed it with "Reply to The Unit of Caring on Adult Human Females", responding to a similar argument from soon-to-be Vox journalist Kelsey Piper, then writing as The Unit of Caring on Tumblr.

I'm proud of those posts. I think Alexander's and Piper's arguments were incredibly dumb, and that with a lot of effort, I did a pretty good job of explaining why to anyone who was interested and didn't, at some level, prefer not to understand.

Of course, a pretty good job of explaining by one niche blogger wasn't going to put much of a dent in the culture, which is the sum of everyone's blogposts; despite the mild boost from the Slate Star Codex links post, my megaphone just wasn't very big. I was disappointed with the limited impact of my work, but not to the point of bearing much hostility to "the community." People had made their arguments, and I had made mine; I didn't think I was entitled to anything more than that.

Really, that should have been the end of the story. Not much of a story at all. If I hadn't been further provoked, I would have still kept up this blog, and I still would have ended up arguing about gender with people sometimes, but this personal obsession wouldn't have been the occasion of a robot-cult religious civil war involving other people whom you'd expect to have much more important things to do with their time.

The casus belli for the religious civil war happened on 28 November 2018. I was at my new dayjob's company offsite event in Austin, Texas. Coincidentally, I had already spent much of the previous two days (since just before the plane to Austin took off) arguing trans issues with other "rationalists" on Discord.

Just that month, I had started a Twitter account using my real name, inspired in an odd way by the suffocating wokeness of the Rust open-source software scene where I occasionally contributed diagnostics patches to the compiler. My secret plan/fantasy was to get more famous and established in the Rust world (one of compiler team membership, or conference talk accepted, preferably both), get some corresponding Twitter followers, and then bust out the @BlanchardPhd retweets and links to this blog. In the median case, absolutely nothing would happen (probably because I failed at being famous), but I saw an interesting tail of scenarios in which I'd get to be a test case in the Code of Conduct wars.

So, now having a Twitter account, I was browsing Twitter in the bedroom at the rental house for the dayjob retreat when I happened to come across this thread by @ESYudkowsky:

Some people I usually respect for their willingness to publicly die on a hill of facts, now seem to be talking as if pronouns are facts, or as if who uses what bathroom is necessarily a factual statement about chromosomes. Come on, you know the distinction better than that!

Even if somebody went around saying, "I demand you call me 'she' and furthermore I claim to have two X chromosomes!", which none of my trans colleagues have ever said to me by the way, it still isn't a question-of-empirical-fact whether she should be called "she". It's an act.

In saying this, I am not taking a stand for or against any Twitter policies. I am making a stand on a hill of meaning in defense of validity, about the distinction between what is and isn't a stand on a hill of facts in defense of truth.

I will never stand against those who stand against lies. But changing your name, asking people to address you by a different pronoun, and getting sex reassignment surgery, Is. Not. Lying. You are ontologically confused if you think those acts are false assertions.

Some of the replies tried to explain the obvious problem—and Yudkowsky kept refusing to understand:

Using language in a way you dislike, openly and explicitly and with public focus on the language and its meaning, is not lying. The proposition you claim false (chromosomes?) is not what the speech is meant to convey—and this is known to everyone involved, it is not a secret.

Now, maybe as a matter of policy, you want to make a case for language being used a certain way. Well, that's a separate debate then. But you're not making a stand for Truth in doing so, and your opponents aren't tricking anyone or trying to.

repeatedly:

You're mistaken about what the word means to you, I demonstrate thus: https://en.wikipedia.org/wiki/XX_male_syndrome

But even ignoring that, you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning.

Dear reader, this is the moment where I flipped out. Let me explain.

This "hill of meaning in defense of validity" proclamation was such a striking contrast to the Eliezer Yudkowsky I remembered—the Eliezer Yudkowsky I had variously described as having "taught me everything I know" and "rewritten my personality over the internet"—who didn't hesitate to criticize uses of language that he thought were failing to "carve reality at the joints", even going so far as to call them "wrong":

[S]aying "There's no way my choice of X can be 'wrong'" is nearly always an error in practice, whatever the theory. You can always be wrong. Even when it's theoretically impossible to be wrong, you can still be wrong. There is never a Get-Out-Of-Jail-Free card for anything you do. That's life.

Similarly:

Once upon a time it was thought that the word "fish" included dolphins. Now you could play the oh-so-clever arguer, and say, "The list: {Salmon, guppies, sharks, dolphins, trout} is just a list—you can't say that a list is wrong. I can prove in set theory that this list exists. So my definition of fish, which is simply this extensional list, cannot possibly be 'wrong' as you claim."

Or you could stop playing nitwit games and admit that dolphins don't belong on the fish list.

You come up with a list of things that feel similar, and take a guess at why this is so. But when you finally discover what they really have in common, it may turn out that your guess was wrong. It may even turn out that your list was wrong.

You cannot hide behind a comforting shield of correct-by-definition. Both extensional definitions and intensional definitions can be wrong, can fail to carve reality at the joints.

One could argue that this "Words can be wrong when your definition draws a boundary around things that don't really belong together" moral didn't apply to Yudkowsky's new Tweets, which only mentioned pronouns and bathroom policies, not the extensions of common nouns.

But this seems pretty unsatisfying in the context of Yudkowsky's claim to "not [be] taking a stand for or against any Twitter policies". One of the Tweets that had recently led to radical feminist Meghan Murphy getting kicked off the platform read simply, "Men aren't women tho." This doesn't seem like a policy claim; rather, Murphy was using common language to express the fact-claim that members of the natural category of adult human males, are not, in fact, members of the natural category of adult human females.

Thus, if the extension of common words like "woman" and "man" is an issue of epistemic importance that rationalists should care about, then presumably so was Twitter's anti-misgendering policy—and if it isn't (because you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning) then I wasn't sure what was left of the "Human's Guide to Words" Sequence if the 37-part grand moral needed to be retracted.

I think I am standing in defense of truth when I have an argument for why my preferred word usage does a better job at carving reality at the joints, and the one bringing my usage explicitly into question does not. As such, I didn't see the practical difference between "you're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning," and "I can define a word any way I want." About which, again, an earlier Eliezer Yudkowsky had written:

"It is a common misconception that you can define a word any way you like. [...] If you believe that you can 'define a word any way you like', without realizing that your brain goes on categorizing without your conscious oversight, then you won't take the effort to choose your definitions wisely."

"So that's another reason you can't 'define a word any way you like': You can't directly program concepts into someone else's brain."

"When you take into account the way the human mind actually, pragmatically works, the notion 'I can define a word any way I like' soon becomes 'I can believe anything I want about a fixed set of objects' or 'I can move any object I want in or out of a fixed membership test'."

"There's an idea, which you may have noticed I hate, that 'you can define a word any way you like'."

"And of course you cannot solve a scientific challenge by appealing to dictionaries, nor master a complex skill of inquiry by saying 'I can define a word any way I like'."

"Categories are not static things in the context of a human brain; as soon as you actually think of them, they exert force on your mind. One more reason not to believe you can define a word any way you like."

"And people are lazy. They'd rather argue 'by definition', especially since they think 'you can define a word any way you like'."

"And this suggests another—yes, yet another—reason to be suspicious of the claim that 'you can define a word any way you like'. When you consider the superexponential size of Conceptspace, it becomes clear that singling out one particular concept for consideration is an act of no small audacity—not just for us, but for any mind of bounded computing power."

"I say all this, because the idea that 'You can X any way you like' is a huge obstacle to learning how to X wisely. 'It's a free country; I have a right to my own opinion' obstructs the art of finding truth. 'I can define a word any way I like' obstructs the art of carving reality at its joints. And even the sensible-sounding 'The labels we attach to words are arbitrary' obstructs awareness of compactness."

"One may even consider the act of defining a word as a promise to [the] effect [...] [that the definition] will somehow help you make inferences / shorten your messages."

One could argue that I was unfairly interpreting Yudkowsky's Tweets as having a broader scope than was intended—that Yudkowsky only meant to slap down the false claim that using he for someone with a Y chromosome is "lying", without intending any broader implications about trans issues or the philosophy of language. It wouldn't be realistic or fair to expect every public figure to host an exhaustive debate on all related issues every time they encounter a fallacy they want to Tweet about.

However, I don't think this "narrow" reading is the most natural one. Yudkowsky had previously written of what he called the fourth virtue of evenness: "If you are selective about which arguments you inspect for flaws, or how hard you inspect for flaws, then every flaw you learn how to detect makes you that much stupider." He had likewise written on reversed stupidity (bolding mine):

To argue against an idea honestly, you should argue against the best arguments of the strongest advocates. Arguing against weaker advocates proves nothing, because even the strongest idea will attract weak advocates.

Relatedly, Scott Alexander had written about how "weak men are superweapons": speakers often selectively draw attention to the worst arguments in favor of a position in an attempt to socially discredit people who have better arguments (which the speaker ignores). In the same way, by just slapping down a weak man from the "anti-trans" political coalition without saying anything else in a similarly prominent location, Yudkowsky was liable to mislead his faithful students into thinking that there were no better arguments from the "anti-trans" side.

To be sure, it imposes a cost on speakers to not be able to Tweet about one specific annoying fallacy and then move on with their lives without the need for endless disclaimers about related but stronger arguments that they're not addressing. But the fact that Yudkowsky disclaimed that he wasn't taking a stand for or against Twitter's anti-misgendering policy demonstrates that he didn't have an aversion to spending a few extra words to prevent the most common misunderstandings.

Given that, it's hard to read the Tweets Yudkowsky published as anything other than an attempt to intimidate and delegitimize people who want to use language to reason about sex rather than gender identity. It's just not plausible that Yudkowsky was simultaneously savvy enough to choose to make these particular points while also being naïve enough to not understand the political context. Deeper in the thread, he wrote:

The more technology advances, the further we can move people towards where they say they want to be in sexspace. Having said this we've said all the facts. Who competes in sports segregated around an Aristotelian binary is a policy question (that I personally find very humorous).

Sure, in the limit of arbitrarily advanced technology, everyone could be exactly where they wanted to be in sexspace. Having said this, we have not said all the facts relevant to decisionmaking in our world, where we do not have arbitrarily advanced technology (as Yudkowsky well knew, having written a post about how technically infeasible an actual sex change would be). As Yudkowsky acknowledged in the previous Tweet, "Hormone therapy changes some things and leaves others constant." The existence of hormone replacement therapy does not itself take us into the glorious transhumanist future where everyone is the sex they say they are.

The reason for sex-segregated sports leagues is that sport-relevant multivariate trait distributions of female bodies and male bodies are different: men are taller, stronger, and faster. If you just had one integrated league, females wouldn't be competitive (in the vast majority of sports, with a few exceptions like ultra-distance swimming that happen to sample an unusually female-favorable corner of sportspace).

Given the empirical reality of the different trait distributions, "Who are the best athletes among females?" is a natural question for people to be interested in and want separate sports leagues to determine. Including male people in female sports leagues undermines the point of having a separate female league, and hormone replacement therapy after puberty doesn't substantially change the picture here.1

Yudkowsky's suggestion that an ignorant commitment to an "Aristotelian binary" is the main reason someone might care about the integrity of women's sports is an absurd strawman. This just isn't something any scientifically literate person would write if they had actually thought about the issue at all, as opposed to having first decided (consciously or not) to bolster their reputation among progressives by dunking on transphobes on Twitter, and then wielding their philosophy knowledge in the service of that political goal. The relevant facts are not subtle, even if most people don't have the fancy vocabulary to talk about them in terms of "multivariate trait distributions."

I'm picking on the "sports segregated around an Aristotelian binary" remark because sports is a case where the relevant effect sizes are so large as to make the point hard for all but the most ardent gender-identity partisans to deny. (For example, what the Cohen's d ≈ 2.6 effect size difference in muscle mass means is that a woman as strong as the average man is at the 99.5th percentile for women.) But the point is general: biological sex exists and is sometimes decision-relevant. People who want to be able to talk about sex and make policy decisions on the basis of sex are not making an ontology error, because the ontology in which sex "actually" "exists" continues to make very good predictions in our current tech regime (if not the glorious transhumanist future). It would be a ridiculous isolated demand for rigor to expect someone to pass a graduate exam about the philosophy and cognitive science of categorization before they can talk about sex.
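To make the arithmetic behind that percentile figure concrete, here is a minimal sketch. It assumes both groups are normally distributed with the same standard deviation, which is an idealization and not something the cited effect size guarantees; the function name is hypothetical and chosen only for illustration. Under that assumption, the higher group's mean sits d standard deviations above the lower group's mean, so its percentile within the lower group is the standard normal CDF evaluated at d.

```python
from statistics import NormalDist


def percentile_of_other_group_mean(d: float) -> float:
    """Percentile, within the lower-scoring group, of the higher group's mean.

    Assumes (for illustration only) that both groups are normal with
    equal standard deviation, so the gap between means is d SD units.
    """
    # Standard normal CDF at d, expressed as a percentage.
    return 100.0 * NormalDist().cdf(d)


if __name__ == "__main__":
    # Effect size for muscle mass quoted in the text.
    print(round(percentile_of_other_group_mean(2.6), 1))  # → 99.5
```

By the same calculation, d = 0 gives the 50th percentile, which is a quick sanity check that the function is measuring the overlap between the two distributions.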

Thus, Yudkowsky's claim to merely have been standing up for the distinction between facts and policy questions doesn't seem credible. It is, of course, true that pronoun and bathroom conventions are policy decisions rather than matters of fact, but it's bizarre to condescendingly point this out as if it were the crux of contemporary trans-rights debates. Conservatives and gender-critical feminists know that trans-rights advocates aren't falsely claiming that trans women have XX chromosomes! If you just wanted to point out that the rules of sports leagues are a policy question rather than a fact (as if anyone had doubted this), why would you throw in the "Aristotelian binary" weak man and belittle the matter as "humorous"? There are a lot of issues I don't care much about, but I don't see anything funny about the fact that other people do care.2

If any concrete negative consequence of gender self-identity categories is going to be waved away with, "Oh, but that's a mere policy decision that can be dealt with on some basis other than gender, and therefore doesn't count as an objection to the new definition of gender words", then it's not clear what the new definition is for.

Like many gender-dysphoric males, I cosplay female characters at fandom conventions sometimes. And, unfortunately, like many gender-dysphoric males, I'm not very good at it. I think someone looking at some of my cosplay photos and trying to describe their content in clear language—not trying to be nice to anyone or make a point, but just trying to use language as a map that reflects the territory—would say something like, "This is a photo of a man and he's wearing a dress." The word man in that sentence is expressing cognitive work: it's a summary of the lawful cause-and-effect evidential entanglement whereby the photons reflecting off the photograph are correlated with photons reflecting off my body at the time the photo was taken, which are correlated with my externally observable secondary sex characteristics (facial structure, beard shadow, &c.). From this evidence, an agent using an efficient naïve-Bayes-like model can assign me to its "man" (adult human male) category and thereby make probabilistic predictions about traits that aren't directly observable from the photo. The agent would achieve a better score on those predictions than if it had assigned me to its "woman" (adult human female) category.

By "traits" I mean not just sex chromosomes (as Yudkowsky suggested on Twitter), but the conjunction of dozens or hundreds of measurements that are causally downstream of sex chromosomes: reproductive organs and muscle mass (again, sex difference effect size of Cohen's d ≈ 2.6) and Big Five Agreeableness (d ≈ 0.5) and Big Five Neuroticism (d ≈ 0.4) and short-term memory (d ≈ 0.2, favoring women) and white-gray-matter ratios in the brain and probable socialization history and any number of other things—including differences we might not know about, but have prior reasons to suspect exist. No one knew about sex chromosomes before 1905, but given the systematic differences between women and men, it would have been reasonable to suspect the existence of some sort of molecular mechanism of sex determination.
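As a rough illustration of how a naïve-Bayes-like model combines many such measurements, here is a minimal sketch. Everything in it beyond the effect-size magnitudes quoted above is an assumption made for the example: each trait is treated as normal with unit variance in both groups, traits are treated as independent given sex (the "naïve" part, which real traits violate), the signs of the effect sizes are my own labeling of direction, and the names `EFFECT_SIZES` and `posterior_male` are hypothetical. Observations are z-scores measured from the midpoint between the two group means.

```python
import math

# Signed effect sizes: male mean minus female mean, in SD units.
# Magnitudes are the ones quoted in the text; the signs are assumptions.
EFFECT_SIZES = {
    "muscle_mass": 2.6,
    "agreeableness": -0.5,
    "neuroticism": -0.4,
    "short_term_memory": -0.2,
}


def posterior_male(observations: dict[str, float], prior_male: float = 0.5) -> float:
    """Posterior probability of the "male" category given observed traits.

    Each observation is a z-score relative to the midpoint between the
    two group means, so the group means sit at -d/2 and +d/2.
    """
    log_odds = math.log(prior_male / (1.0 - prior_male))
    for trait, z in observations.items():
        d = EFFECT_SIZES[trait]
        # log N(z; +d/2, 1) - log N(z; -d/2, 1) simplifies to d * z.
        log_odds += d * z
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Someone exactly at the male mean on every trait (z = d / 2).
    typical = {trait: d / 2.0 for trait, d in EFFECT_SIZES.items()}
    print(posterior_male(typical))
    # The same person, observed on one small-effect trait only.
    print(posterior_male({"agreeableness": -0.25}))
```

The design point is that the log-odds contributions add: a single small-effect trait barely moves the posterior away from the prior, while the conjunction of several traits, especially one with a large effect size, moves it a great deal.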

Forcing a speaker to say "trans woman" instead of "man" in a sentence about my cosplay photos depending on my verbally self-reported self-identity may not be forcing them to lie, exactly. It's understood, "openly and explicitly and with public focus on the language and its meaning," what trans women are; no one is making a false-to-fact claim about them having ovaries, for example. But it is forcing the speaker to obfuscate the probabilistic inference they were trying to communicate with the original sentence (about modeling the person in the photograph as being sampled from the "man" cluster in configuration space), and instead use language that suggests a different cluster-structure. ("Trans women", two words, are presumably a subcluster within the "women" cluster.) Crowing in the public square about how people who object to being forced to "lie" must be ontologically confused is ignoring the interesting part of the problem. Gender identity's claim to be non-disprovable functions as a way to avoid the belief's real weak points.

To this, one might reply that I'm giving too much credit to the "anti-trans" faction for how stupid they're not being: that my careful dissection of the hidden probabilistic inferences implied by words (including pronoun choices) is all well and good, but calling pronouns "lies" is not something you do when you know how to use words.

But I'm not giving them credit for understanding the lessons of "A Human's Guide to Words"; I just think there's a useful sense of "know how to use words" that embodies a lower standard of philosophical rigor. If a person-in-the-street says of my cosplay photos, "That's a man! I have eyes, and I can see that that's a man! Men aren't women!"—well, I probably wouldn't want to invite them to a Less Wrong meetup. But I do think the person-in-the-street is performing useful cognitive work. Because I have the hidden-Bayesian-structure-of-language-and-cognition-sight (thanks to Yudkowsky's writings back in the 'aughts), I know how to sketch out the reduction of "Men aren't women" to something more like "This cognitive algorithm detects secondary sex characteristics and uses them as a classifier for a binary female/male 'sex' category, which it uses to make predictions about not-yet-observed features ..."

But having done the reduction-to-cognitive-algorithms, it still looks like the person-in-the-street has a point that I shouldn't be allowed to ignore just because I have 30 more IQ points and better philosophy-of-language skills?

I bring up my bad cosplay photos as an edge case that helps illustrate the problem I'm trying to point out, much like how people love to bring up complete androgen insensitivity syndrome to illustrate why "But chromosomes!" isn't the correct reduction of sex classification. To differentiate what I'm saying from blind transphobia, let me note that I predict that most people-in-the-street would be comfortable using feminine pronouns for someone like Blaire White. That's evidence about the kind of cognitive work people's brains are doing when they use English pronouns! Certainly, English is not the only language, and ours is not the only culture; maybe there is a way to do gender categories that would be more accurate and better for everyone. But to find what that better way is, we need to be able to talk about these kinds of details in public, and the attitude evinced in Yudkowsky's Tweets seemed to function as a semantic stopsign to get people to stop talking about the details.

If you were interested in having a real discussion (instead of a fake discussion that makes you look good to progressives), why would you slap down the "But, but, chromosomes" fallacy and then not engage with the obvious steelman of "But, but, clusters in high-dimensional configuration space that aren't actually changeable with contemporary technology", which was, in fact, brought up in the replies?

Satire is a weak form of argument: the one who wishes to doubt will always be able to find some aspect in which an obviously absurd satirical situation differs from the real-world situation being satirized and claim that that difference destroys the relevance of the joke. But on the off chance that it might help illustrate the objection, imagine you lived in a so-called "rationalist" subculture where conversations like this happened—

⁕ ⁕ ⁕

Bob: Look at this adorable cat picture!

Alice: Um, that looks like a dog to me, actually.

Bob: You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning. Now, maybe as a matter of policy, you want to make a case for language being used a certain way. Well, that's a separate debate then.

⁕ ⁕ ⁕

If you were Alice, and a solid supermajority of your incredibly smart, incredibly philosophically sophisticated friend group including Eliezer Yudkowsky (!!!) seemed to behave like Bob, that would be a worrying sign about your friends' ability to accomplish intellectually hard things like AI alignment, right? Even if there isn't any pressing practical need to discriminate between dogs and cats, the problem is that Bob is selectively using his sophisticated philosophy-of-language knowledge to try to undermine Alice's ability to use language to make sense of the world, even though Bob obviously knows very well what Alice was trying to say. It's incredibly obfuscatory in a way that people—the same people—would not tolerate in almost any other context.

Imagine an Islamic theocracy in which one Megan Murfi (ميغان ميرفي) had recently gotten kicked off the dominant microblogging platform for speaking disrespectfully about the prophet Muhammad. Suppose that Yudkowsky's analogue in that world then posted that those objecting on free inquiry grounds were ontologically confused: saying "peace be upon him" after the name of the prophet Muhammad is a speech act, not a statement of fact. In banning Murfi for repeatedly speaking about the prophet Muhammad (peace be upon him) as if he were just some guy, the platform was merely "enforcing a courtesy standard" (in the words of our world's Yudkowsky). Murfi wasn't being forced to lie.

I think the atheists of our world, including Yudkowsky, would not have trouble seeing the problem with this scenario, nor hesitate to agree that it is a problem for that Society's rationality. Saying "peace be upon him" is indeed a speech act rather than a statement of fact, but it would be bizarre to condescendingly point this out as if it were the crux of debates about religious speech codes. The function of the speech act is to signal the speaker's affirmation of Muhammad's divinity. That's why the Islamic theocrats want to mandate that everyone say it: it's a lot harder for atheism to get any traction if no one is allowed to talk like an atheist.

And that's why trans advocates want to mandate against misgendering people on social media: it's harder for trans-exclusionary ideologies to get any traction if no one is allowed to talk like someone who believes that sex (sometimes) matters and gender identity does not.

Of course, such speech restrictions aren't necessarily "irrational", depending on your goals. If you just don't think "free speech" should go that far—if you want to suppress atheism or gender-critical feminism with an iron fist—speech codes are a perfectly fine way to do it! And to their credit, I think most theocrats and trans advocates are intellectually honest about what they're doing: atheists or transphobes are bad people (the argument goes) and we want to make it harder for them to spread their lies or their hate.

In contrast, by claiming to be "not taking a stand for or against any Twitter policies" while accusing people who opposed the policy of being ontologically confused, Yudkowsky was being less honest than the theocrat or the activist: of course the point of speech codes is to suppress ideas! Given that the distinction between facts and policies is so obviously not anyone's crux—the smarter people in the "anti-trans" faction already know that, and the dumber people in the faction wouldn't change their alignment if they were taught—it's hard to see what the point of harping on the fact/policy distinction would be, except to be seen as implicitly taking a stand for the "pro-trans" faction while putting on a show of being politically "neutral."

It makes sense that Yudkowsky might perceive political constraints on what he might want to say in public—especially when you look at what happened to the other Harry Potter author.3 But if Yudkowsky didn't want to get into a distracting fight about a politically-charged topic, then maybe the responsible thing to do would have been to just not say anything about the topic, rather than engaging with the stupid version of the opposition and stonewalling with "That's a policy question" when people tried to point out the problem?!


I didn't have all of that criticism collected and carefully written up on 28 November 2018. But that, basically, is why I flipped out when I saw that Twitter thread. If the "rationalists" didn't click on the autogynephilia thing, that was disappointing, but forgivable. If the "rationalists", on Scott Alexander's authority, were furthermore going to get our own philosophy of language wrong over this, that was—I don't want to say forgivable exactly, but it was tolerable. I had learned from my misadventures the previous year that I had been wrong to trust "the community" as a reified collective. That had never been a reasonable mental stance in the first place.

But trusting Eliezer Yudkowsky—whose writings, more than any other single influence, had made me who I am—did seem reasonable. If I put him on a pedestal, it was because he had earned the pedestal, for supplying me with my criteria for how to think—including, as a trivial special case, how to think about what things to put on pedestals.

So if the rationalists were going to get our own philosophy of language wrong over this and Eliezer Yudkowsky was in on it (!!!), that was intolerable, inexplicable, incomprehensible—like there wasn't a real world anymore.

At the dayjob retreat, I remember going downstairs to impulsively confide in a senior engineer, an older bald guy who exuded masculinity, who you could tell by his entire manner and being was not infected by the Berkeley mind-virus, no matter how loyally he voted Democrat. I briefly explained the situation to him—not just the immediate impetus of this Twitter thread, but this whole thing of the past couple years where my entire social circle just suddenly decided that guys like me could be women by means of saying so. He was noncommittally sympathetic; he told me an anecdote about him accepting a trans person's correction of his pronoun usage, with the thought that different people have their own beliefs, and that's OK.

If Yudkowsky was already stonewalling his Twitter followers, entering the thread myself didn't seem likely to help. (Also, less importantly, I hadn't intended to talk about gender on that account yet.)

It seemed better to try to clear this up in private. I still had Yudkowsky's email address, last used when I had offered to pay to talk about his theory of MtF two years before. I felt bad bidding for his attention over my gender thing again—but I had to do something. Hands trembling, I sent him an email asking him to read my "The Categories Were Made for Man to Make Predictions", suggesting that it might qualify as an answer to his question about "a page [he] could read to find a non-confused exclamation of how there's scientific truth at stake". I said that because I cared very much about correcting confusions in my rationalist subculture, I would be happy to pay up to $1000 for his time—and that, if he liked the post, he might consider Tweeting a link—and that I was cc'ing my friends Anna Salamon and Michael Vassar as character references (Subject: "another offer, $1000 to read a ~6500 word blog post about (was: Re: Happy Price offer for a 2 hour conversation)"). Then I texted Anna and Michael, begging them to vouch for my credibility.

The monetary offer, admittedly, was awkward: I included another paragraph clarifying that any payment was only to get his attention, not quid pro quo advertising, and that if he didn't trust his brain circuitry not to be corrupted by money, then he might want to reject the offer on those grounds and only read the post if he expected it to be genuinely interesting.

Again, I realize this must seem weird and cultish to any normal people reading this. (Paying some blogger you follow one grand just to read one of your posts? What? Why? Who does that?) To this, I again refer to the reasons justifying my 2016 cheerful price offer—and that, along with tagging in Anna and Michael, whom I thought Yudkowsky respected, it was a way to signal that I really didn't want to be ignored, which I assumed was the default outcome. An ordinary programmer such as me was as a mere worm in the presence of the great Eliezer Yudkowsky. I wouldn't have had the audacity to contact him at all, about anything, if I didn't have Something to Protect.

Anna didn't reply, but I apparently did interest Michael, who chimed in on the email thread to Yudkowsky. We had a long phone conversation the next day lamenting how the "rationalists" were dead as an intellectual community.

As for the attempt to intervene on Yudkowsky—here I need to make a digression about the constraints I'm facing in telling this Whole Dumb Story. I would prefer to just tell this Whole Dumb Story as I would to my long-neglected Diary—trying my best at the difficult task of explaining what actually happened during an important part of my life, without thought of concealing anything.

(If you are silent about your pain, they'll kill you and say you enjoyed it.)

Unfortunately, a lot of other people seem to have strong intuitions about "privacy", which bizarrely impose constraints on what I'm allowed to say about my own life: in particular, it's considered unacceptable to publicly quote or summarize someone's emails from a conversation that they had reason to expect to be private. I feel obligated to comply with these widely held privacy norms, even if I think they're paranoid and anti-social. (This secrecy-hating trait probably correlates with the autogynephilia blogging; someone otherwise like me who believed in privacy wouldn't be telling you this Whole Dumb Story.)

So I would think that while telling this Whole Dumb Story, I obviously have an inalienable right to blog about my own actions, but I'm not allowed to directly refer to private conversations with named individuals in cases where I don't think I'd be able to get the consent of the other party. (I don't think I'm required to go through the ritual of asking for consent in cases where the revealed information couldn't reasonably be considered "sensitive", or if I know the person doesn't have hangups about this weird "privacy" thing.) In this case, I'm allowed to talk about emailing Yudkowsky (because that was my action), but I'm not allowed to talk about anything he might have said in reply, or whether he did.

Unfortunately, there's a potentially serious loophole in the commonsense rule: what if some of my actions (which I would have hoped to have an inalienable right to blog about) depend on content from private conversations? You can't, in general, only reveal one side of a conversation.

Suppose Carol messages Dave at 5 p.m., "Can you come to the party?", and also, separately, that Carol messages Dave at 6 p.m., "Gout isn't contagious." Should Carol be allowed to blog about the messages she sent at 5 p.m. and 6 p.m., because she's only describing her own messages and not confirming or denying whether Dave replied at all, let alone quoting him?

I think commonsense privacy-norm-adherence intuitions actually say No here: the text of Carol's messages makes it too easy to guess that sometime between 5 and 6, Dave probably said that he couldn't come to the party because he has gout. It would seem that Carol's right to talk about her own actions in her own life does need to take into account some commonsense judgment of whether that leaks "sensitive" information about Dave.

In the substory (of my Whole Dumb Story) that follows, I'm going to describe several times that I and others emailed Yudkowsky to argue with what he said in public, without saying anything about whether Yudkowsky replied or what he might have said if he did reply. I maintain that I'm within my rights here, because I think commonsense judgment will agree that me talking about the arguments I made does not leak any sensitive information about the other side of a conversation that may or may not have happened. I think the story comes off relevantly the same whether Yudkowsky didn't reply at all (e.g., because he was too busy with more existentially important things to check his email), or whether he replied in a way that I found sufficiently unsatisfying as to occasion the further emails with followup arguments that I describe. (Talking about later emails does rule out the possible world where Yudkowsky had said, "Please stop emailing me," because I would have respected that, but the fact that he didn't say that isn't "sensitive".)

It seems particularly important to lay out these judgments about privacy norms in connection to my attempts to contact Yudkowsky, because part of what I'm trying to accomplish in telling this Whole Dumb Story is to deal reputational damage to Yudkowsky, which I claim is deserved. (We want reputations to track reality. If you see Erin exhibiting a pattern of intellectual dishonesty, and she keeps doing it even after you talk to her about it privately, you might want to write a blog post describing the pattern in detail—not to hurt Erin, particularly, but so that everyone else can make higher-quality decisions about whether they should believe the things that Erin says.) Given that motivation of mine, it seems important that I only try to hang Yudkowsky with the rope of what he said in public, where you can click the links and read the context for yourself: I'm attacking him, but not betraying him. In the substory that follows, I also describe correspondence with Scott Alexander, but that doesn't seem sensitive in the same way, because I'm not particularly trying to deal reputational damage to Alexander. (Not because Scott performed well, but because one wouldn't really have expected him to in this situation; Alexander's reputation isn't so direly in need of correction.)

Thus, I don't think I should say whether Yudkowsky replied to Michael's and my emails, nor (again) whether he accepted the cheerful-price money, because any conversation that may or may not have occurred would have been private. But what I can say, because it was public, is that we saw this addition to the Twitter thread:

I was sent this (by a third party) as a possible example of the sort of argument I was looking to read: http://unremediatedgender.space/2018/Feb/the-categories-were-made-for-man-to-make-predictions/. Without yet judging its empirical content, I agree that it is not ontologically confused. It's not going "But this is a MAN so using 'she' is LYING."

Look at that! The great Eliezer Yudkowsky said that my position is "not ontologically confused." That's probably high praise, coming from him!

You might think that that should have been the end of the story. Yudkowsky denounced a particular philosophical confusion, I already had a related objection written up, and he publicly acknowledged my objection as not being the confusion he was trying to police. I should be satisfied, right?

I wasn't, in fact, satisfied. This little "not ontologically confused" clarification buried deep in the replies was much less visible than the bombastic, arrogant top-level pronouncement insinuating that resistance to gender-identity claims was confused. (1 Like on this reply, vs. 140 Likes/18 Retweets on start of thread.) This little follow-up did not seem likely to disabuse the typical reader of the impression that Yudkowsky thought gender-identity skeptics didn't have a leg to stand on. Was it greedy of me to want something louder?

Greedy or not, I wasn't done flipping out. On 1 December 2018, I wrote to Scott Alexander (cc'ing a few other people) to ask if there was any chance of an explicit and loud clarification or partial retraction of "... Not Man for the Categories" (Subject: "super-presumptuous mail about categorization and the influence graph"). Forget my boring whining about the autogynephilia/two-types thing, I said—that's a complicated empirical claim, and not the key issue.

The issue was that category boundaries are not arbitrary (if you care about intelligence being useful). You want to draw your category boundaries such that things in the same category are similar in the respects that you care about predicting/controlling, and you want to spend your information-theoretically limited budget of short words on the simplest and most widely useful categories.

It was true that the reason I was continuing to freak out about this to the extent of sending him this obnoxious email telling him what to write (seriously, who does that?!) was because of transgender stuff, but that wasn't why Scott should care.

The other year, Alexander had written a post, "Kolmogorov Complicity and the Parable of Lightning", explaining the consequences of political censorship with an allegory about a Society with the dogma that thunder occurs before lightning.4 Alexander had explained that the problem with complying with the dictates of a false orthodoxy wasn't the sacred dogma itself (it's not often that you need to directly make use of the fact that lightning comes first), but that the need to defend the sacred dogma destroys everyone's ability to think.

It was the same thing here. It wasn't that I had any practical need to misgender anyone in particular. It still wasn't okay that talking about the reality of biological sex to so-called "rationalists" got you an endless deluge of—polite! charitable! non-ostracism-threatening!—bullshit nitpicking. (What about complete androgen insensitivity syndrome? Why doesn't this ludicrous misinterpretation of what you said imply that lesbians aren't women? &c. ad infinitum.) With enough time, I thought the nitpicks could and should be satisfactorily answered; any remaining would presumably be fatal criticisms rather than bullshit nitpicks. But while I was in the process of continuing to write all that up, I hoped Alexander could see why I felt somewhat gaslighted.

(I had been told by others that I wasn't using the word "gaslighting" correctly. No one seemed to think I had the right to define that category boundary for my convenience.)

If our vaunted rationality techniques resulted in me having to spend dozens of hours patiently explaining why I didn't think that I was a woman (where "not a woman" is a convenient rhetorical shorthand for a much longer statement about naïve Bayes models and high-dimensional configuration spaces and defensible Schelling points for social norms), then our techniques were worse than useless.

If Galileo ever muttered "And yet it moves", there's a long and nuanced conversation you could have about the consequences of using the word "moves" in Galileo's preferred sense, as opposed to some other sense that happens to result in the theory needing more epicycles. It may not have been obvious in November 2014 when "... Not Man for the Categories" was published, but in retrospect, maybe it was a bad idea to build a memetic superweapon that says that the number of epicycles doesn't matter.

The reason to write this as a desperate email plea to Scott Alexander instead of working on my own blog was that I was afraid that marketing is a more powerful force than argument. Rather than good arguments propagating through the population of so-called "rationalists" no matter where they arose, what actually happened was that people like Alexander and Yudkowsky rose to power on the strength of good arguments and entertaining writing (but mostly the latter), and then everyone else absorbed some of their worldview (plus noise and conformity with the local environment). So for people who didn't win the talent lottery but thought they saw a flaw in the zeitgeist, the winning move was "persuade Scott Alexander."

Back in 2010, the rationalist community had a shared understanding that the function of language is to describe reality. Now, we didn't. If Scott didn't want to cite my creepy blog about my creepy fetish, that was fine; I liked getting credit, but the important thing was that this "No, the Emperor isn't naked—oh, well, we're not claiming that he's wearing any garments—it would be pretty weird if we were claiming that!—it's just that utilitarianism implies that the social property of clothedness should be defined this way because to do otherwise would be really mean to people who don't have anything to wear" maneuver needed to die, and he alone could kill it.

Scott didn't get it. We agreed that gender categories based on self-identity, natal sex, and passing each had their own pros and cons, and that it's uninteresting to focus on whether something "really" belongs to a category rather than on communicating what you mean. Scott took this to mean that what convention to use is a pragmatic choice we can make on utilitarian grounds, and that being nice to trans people was worth a little bit of clunkiness—that the mental health benefits to trans people were obviously enough to tip the first-order utilitarian calculus.

I didn't think anything about "mental health benefits to trans people" was obvious. More importantly, I considered myself to be prosecuting not the object-level question of which gender categories to use but the meta-level question of what normative principles govern the use of categories. For this, "whatever, it's a pragmatic choice, just be nice" wasn't an answer, because the normative principles exclude "just be nice" from being a relevant consideration.

"... Not Man for the Categories" had concluded with a section on Emperor Norton, a 19th-century San Francisco resident who declared himself Emperor of the United States. Certainly, it's not difficult or costly for the citizens of San Francisco to address Norton as "Your Majesty". But there's more to being Emperor of the United States than what people call you. Unless we abolish Congress and have the military enforce Norton's decrees, he's not actually emperor—at least not according to the currently generally understood meaning of the word.

What are you going to do if Norton takes you literally? Suppose he says, "I ordered the Imperial Army to invade Canada last week; where are the troop reports? And why do the newspapers keep talking about this so-called 'President' Rutherford B. Hayes? Have this pretender Hayes executed at once and bring his head to me!"

You're not really going to bring him Rutherford B. Hayes's head. So what are you going to tell him? "Oh, well, you're not a cis emperor who can command executions. But don't worry! Trans emperors are emperors"?

To be sure, words can be used in many ways depending on context, but insofar as Norton is interpreting "emperor" in the traditional sense, and you keep calling him your emperor without caveats or disclaimers, you are lying to him.

Scott still didn't get it. But I did soon end up in more conversation with Michael Vassar, Ben Hoffman, and Sarah Constantin, who were game to help me reach out to Yudkowsky again to explain the problem in more detail—and to appeal to the conscience of someone who built their career on higher standards.

Yudkowsky probably didn't think much of Atlas Shrugged (judging by an offhand remark by our protagonist in Harry Potter and the Methods), but I kept thinking of the scene5 where our heroine, Dagny Taggart, entreats the great Dr. Robert Stadler to denounce an egregiously deceptive but technically-not-lying statement by the State Science Institute, whose legitimacy derives from its association with his name. Stadler has become cynical in his old age and demurs: "I can't help what people think—if they think at all!" ... "How can one deal in truth when one deals with the public?"

At this point, I still trusted Yudkowsky to do better than an Ayn Rand villain; I had faith that Eliezer Yudkowsky could deal in truth when he deals with the public.

(I was wrong.)

If we had this entire posse, I felt bad and guilty and ashamed about focusing too much on my special interest except insofar as it was genuinely a proxy for "Has Eliezer and/or everyone else lost the plot, and if so, how do we get it back?" But the group seemed to agree that my philosophy-of-language grievance was a useful test case.

At times, it felt like my mind shut down with only the thought, "What am I doing? This is absurd. Why am I running around picking fights about the philosophy of language—and worse, with me arguing for the Bad Guys' position? Maybe I'm wrong and should stop making a fool of myself. After all, using Aumann-like reasoning, in a dispute of 'me and Michael Vassar vs. everyone else', wouldn't I want to bet on 'everyone else'?"

Except ... I had been raised back in the 'aughts to believe that you're supposed to concede arguments on the basis of encountering a superior counterargument, and I couldn't actually point to one. "Maybe I'm making a fool out of myself by picking fights with all these high-status people" is not a counterargument.

Anna continued to be disinclined to take a side in the brewing Category War, and it was beginning to put a strain on our friendship, to the extent that I kept ending up crying during our occasional meetings. She said that my "You have to pass my philosophy-of-language litmus test or I lose all respect for you as a rationalist" attitude was psychologically coercive. I agreed—I was even willing to go up to "violent", in the sense that I'd cop to trying to apply social incentives toward an outcome rather than merely exchanging information. But sometimes you need to use violence in defense of self or property. If we thought of the "rationalist" brand name as intellectual property, maybe it was property worth defending, and if so, then "I can define a word any way I want" wasn't an obviously terrible time to start shooting at the bandits.

My hope was that it was possible to apply just enough "What kind of rationalist are you?!" social pressure to cancel out the "You don't want to be a Bad (Red) person, do you??" social pressure and thereby let people look at the arguments—though I wasn't sure if that even works, and I was growing exhausted from all the social aggression I was doing. (If someone tries to take your property and you shoot at them, you could be said to be the "aggressor" in the sense that you fired the first shot, even if you hope that the courts will uphold your property claim later.)

After some more discussion within the me/Michael/Ben/Sarah posse, on 4 January 2019, I wrote to Yudkowsky again (a second time), to explain the specific problems with his "hill of meaning in defense of validity" Twitter performance, since that apparently hadn't been obvious from the earlier link to "... To Make Predictions". I cc'ed the posse, who chimed in afterwards.

Ben explained what kind of actions we were hoping for from Yudkowsky: that he would (1) notice that he'd accidentally been participating in an epistemic war, (2) generalize the insight (if he hadn't noticed, what were the odds that MIRI had adequate defenses?), and (3) join the conversation about how to actually have a rationality community, while noticing this particular way in which the problem seemed harder than it used to. For my case in particular, something that would help would be either (A) a clear ex cathedra statement that gender categories are not an exception to the general rule that categories are nonarbitrary, or (B) a clear ex cathedra statement that he's been silenced on this matter. If even (B) was too politically expensive, that seemed like important evidence about (1).

Without revealing the other side of any private conversation that may or may not have occurred, I can say that we did not get either of those ex cathedra statements at this time.

It was also around this time that our posse picked up a new member, whom I'll call "Riley".


On 5 January 2019, I met with Michael and his associate Aurora Quinn-Elmore in San Francisco to attempt mediated discourse with Ziz and Gwen, who were considering suing the Center for Applied Rationality (CfAR)6 for discriminating against trans women. Michael hoped to dissuade them from a lawsuit—not because he approved of CfAR's behavior, but because lawyers make everything worse.

Despite our personality and worldview differences, I had had a number of cooperative interactions with Ziz a couple years before. We had argued about the etiology of transsexualism in late 2016. When I sent her some delusional PMs during my February 2017 psychotic break, she came over to my apartment with chocolate ("allegedly good against dementors"), although I wasn't there. I had awarded her $1200 as part of a credit-assignment ritual to compensate the twenty-one people who were most responsible for me successfully navigating my psychological crises of February and April 2017. (The fact that she had been up to argue about trans etiology meant a lot to me.) I had accepted some packages for her at my apartment in mid-2017 when she was preparing to live on a boat and didn't have a mailing address.

At this meeting, Ziz recounted her story of how Anna Salamon (in her capacity as President of CfAR and community leader) allegedly engaged in conceptual warfare to falsely portray Ziz as a predatory male. I was unimpressed: in my worldview, I didn't think Ziz had the right to say "I'm not a man," and expect people to just believe that. (I remember that at one point, Ziz answered a question with, "Because I don't run off masochistic self-doubt like you." I replied, "That's fair.") But I did respect that Ziz actually believed in an intersex brain theory: in Ziz and Gwen's worldview, people's genders were a fact of the matter, not a manipulation of consensus categories to make people happy.

Probably the most ultimately consequential part of this meeting was Michael verbally confirming to Ziz that MIRI had settled with a disgruntled former employee, Louie Helm, who had put up a website slandering them. (I don't know the details of the alleged settlement. I'm working off of Ziz's notes rather than remembering that part of the conversation clearly myself; I don't know what Michael knew.) What was significant was that if MIRI had paid Helm as part of an agreement to get the slanderous website taken down, then (whatever the nonprofit best-practice books might have said about whether this was a wise thing to do when facing a dispute from a former employee) that would decision-theoretically amount to a blackmail payout, which seemed to contradict MIRI's advocacy of timeless decision theories (according to which you shouldn't be the kind of agent that yields to extortion).


Something else Ben had said while chiming in on the second attempt to reach out to Yudkowsky hadn't sat quite right with me.

I am pretty worried that if I actually point out the physical injuries sustained by some of the smartest, clearest-thinking, and kindest people I know in the Rationalist community as a result of this sort of thing, I'll be dismissed as a mean person who wants to make other people feel bad.

I didn't know what he was talking about. My friend "Rebecca"'s 2015 psychiatric imprisonment ("hospitalization") had probably been partially related to her partner's transition and had involved rough handling by the cops. I had been through some Bad Stuff during my psychotic episodes of February and April 2017, but none of it was "physical injuries." What were the other cases, if he could share without telling me Very Secret Secrets With Names?

Ben said that, probabilistically, he expected that some fraction of the trans women he knew who had "voluntarily" had bottom surgery had done so in response to social pressure, even if some of them might well have sought it out in a less weaponized culture.

I said that saying, "I am worried that if I actually point out the physical injuries ..." when the actual example turned out to be sex reassignment surgery seemed dishonest: I had thought he might have more examples of situations like mine or "Rebecca"'s, where gaslighting escalated into more tangible harm in a way that people wouldn't know about by default. In contrast, people already know that bottom surgery is a thing; Ben just had reasons to think it's Actually Bad—reasons that his friends couldn't engage with if we didn't know what he was talking about. It was bad enough that Yudkowsky was being so cagey; if everyone did it, then we were really doomed.

Ben said he was more worried that saying politically loaded things in the wrong order would reduce our chances of getting engagement from Yudkowsky than that someone would share his words out of context in a way that caused him distinct harm. And maybe more than both of those, that saying the wrong keywords would cause his correspondents to talk about him using the wrong keywords, in ways that caused illegible, hard-to-trace damage.


There's a view that assumes that as long as everyone is being cordial, our truthseeking public discussion must be basically on track; the discussion is only being warped by the fear of heresy if someone is overtly calling to burn the heretics.

I do not hold this view. I think there's a subtler failure mode where people know what the politically favored bottom line is, and collude to ignore, nitpick, or just be uninterested in any fact or line of argument that doesn't fit. I want to distinguish between direct ideological conformity enforcement attempts, and people not living up to their usual epistemic standards in response to ideological conformity enforcement.

Especially compared to normal Berkeley, I had to give the Berkeley "rationalists" credit for being very good at free speech norms. (I'm not sure I would be saying this in the possible world where Scott Alexander didn't have a traumatizing experience with social justice in college, causing him to dump a ton of anti-social-justice, pro-argumentative-charity antibodies into the "rationalist" water supply after he became our subculture's premier writer. But it was true in our world.) I didn't want to fall into the bravery-debate trap of, "Look at me, I'm so heroically persecuted, therefore I'm right (therefore you should have sex with me)". I wasn't angry at the "rationalists" for silencing me (which they didn't); I was angry at them for making bad arguments and systematically refusing to engage with the obvious counterarguments.

As an illustrative example, in an argument on Discord in January 2019, I said, "I need the phrase 'actual women' in my expressive vocabulary to talk about the phenomenon where, if transition technology were to improve, then the people we call 'trans women' would want to make use of that technology; I need language that asymmetrically distinguishes between the original thing that already exists without having to try, and the artificial thing that's trying to imitate it to the limits of available technology".

Kelsey Piper replied, "the people getting surgery to have bodies that do 'women' more the way they want are mostly cis women [...] I don't think 'people who'd get surgery to have the ideal female body' cuts anything at the joints."

Another woman said, "'the original thing that already exists without having to try' sounds fake to me" (to the acclaim of four "+1" emoji reactions).

The problem with this kind of exchange is not that anyone is being shouted down, nor that anyone is lying. The problem is that people are motivatedly, "algorithmically" "playing dumb." I wish we had more standard terminology for this phenomenon, which is ubiquitous in human life. By "playing dumb", I don't mean that Kelsey was consciously thinking, "I'm playing dumb in order to gain an advantage in this argument." I don't doubt that, subjectively, mentioning that cis women also get cosmetic surgery felt like a relevant reply. It's just that, in context, I was obviously trying to talk about the natural category of "biological sex", and Kelsey could have figured that out if she had wanted to.

It's not that anyone explicitly said, "Biological sex isn't real" in those words. (The elephant in the brain knew it wouldn't be able to get away with that.) But if everyone correlatedly plays dumb whenever someone tries to talk about sex in clear language in a context where that could conceivably hurt some trans person's feelings, I think what you have is a culture of de facto biological sex denialism. ("'The original thing that already exists without having to try' sounds fake to me"!!) It's not that hard to get people to admit that trans women are different from cis women, but somehow they can't (in public, using words) follow the implication that trans women are different from cis women because trans women are male.

Ben thought I was wrong to see this behavior as non-ostracizing. The deluge of motivated nitpicking is an implied marginalization threat, he explained: the game people were playing when they did that was to force me to choose between doing arbitrarily large amounts of interpretive labor or being cast as never having answered these construed-as-reasonable objections, and therefore over time losing standing to make the claim, being thought of as unreasonable, not getting invited to events, &c.

I saw the dynamic he was pointing at, but as a matter of personality, I was more inclined to respond, "Welp, I guess I need to write faster and more clearly", rather than, "You're dishonestly demanding arbitrarily large amounts of interpretive labor from me." I thought Ben was far too quick to give up on people whom he modeled as trying not to understand, whereas I continued to have faith in the possibility of making them understand if I just didn't give up. Not to play chess with a pigeon (which craps on the board and then struts around like it's won), or wrestle with a pig (which gets you both dirty, and the pig likes it), or dispute what the Tortoise said to Achilles—but to hold out hope that people in "the community" could only be boundedly motivatedly dense, and anyway that giving up wouldn't make me a stronger writer.

(Picture me playing Hermione Granger in a post-Singularity holonovel adaptation of Harry Potter and the Methods of Rationality, Emma Watson having charged me the standard licensing fee to use a copy of her body for the occasion: "We can do anything if we exert arbitrarily large amounts of interpretive labor!")

Ben thought that making them understand was hopeless and that becoming a stronger writer was a boring goal; it would be a better use of my talents to jump up a meta level and explain how people were failing to engage. That is, insofar as I expected arguing to work, I had a model of "the rationalists" that kept making bad predictions. What was going on there? Something interesting might happen if I tried to explain that.

(I guess I'm only now, after spending an additional four years exhausting every possible line of argument, taking Ben's advice on this by finishing and publishing this memoir. Sorry, Ben—and thanks.)


One thing I regret about my behavior during this period was the extent to which I was emotionally dependent on my posse, and in some ways particularly Michael, for validation. I remembered Michael as a high-status community elder back in the Overcoming Bias era (to the extent that there was a "community" in those early days).7 I had been skeptical of him: the guy makes a lot of stridently "out there" assertions, in a way that makes you assume he must be speaking metaphorically. (He always insists he's being completely literal.) But he had social proof as the President of the Singularity Institute—the "people person" of our world-saving effort, to complement Yudkowsky's antisocial mad scientist personality—which inclined me to take his assertions more charitably than I otherwise would have.

Now, the memory of that social proof was a lifeline. Dear reader, if you've never been in the position of disagreeing with the entire weight of Society's educated opinion, including your idiosyncratic subculture that tells itself a story about being smarter and more open-minded than the surrounding Society—well, it's stressful. There was a comment on the /r/slatestarcodex subreddit around this time that cited Yudkowsky, Alexander, Piper, Ozy Brennan, and Rob Bensinger as leaders of the "rationalist" community. Just an arbitrary Reddit comment of no significance whatsoever—but it was a salient indicator of the zeitgeist to me, because every single one of those people had tried to get away with some variant on the "word usage is subjective, therefore you have no grounds to object to the claim that trans women are women" mind game.

In the face of that juggernaut of received opinion, I was already feeling pretty gaslighted. ("We ... we had a whole Sequence about this. And you were there, and you were there ... It—really happened, right? The hyperlinks still work ...") I don't know how I would have held up intact if I were facing it alone. I definitely wouldn't have had the impudence to pester Alexander and Yudkowsky—especially Yudkowsky—if it was just me against everyone else.

But Michael thought I was in the right—not just intellectually, but morally in the right to be prosecuting the philosophy issue with our leaders. That social proof gave me a lot of bravery that I otherwise wouldn't have been able to muster up—even though it would have been better if I could have internalized that my dependence on him was self-undermining, insofar as Michael himself said that what made me valuable was my ability to think independently.

The social proof was probably more effective in my head than with anyone we were arguing with. I remembered Michael as a high-status community elder back in the Overcoming Bias era, but that had been a long time ago. (Luke Muehlhauser had taken over leadership of the Singularity Institute in 2011, and apparently, some sort of rift between Michael and Eliezer had widened in recent years.) Michael's status in "the community" of 2019 was much more mixed. He was intensely critical of the rise of the Effective Altruism movement, which he saw as using bogus claims about how to do the most good to prey on the smartest and most scrupulous people around. (I remember being at a party in 2015 and asking Michael what else I should spend my San Francisco software engineer money on, if not the EA charities I was considering. I was surprised when his answer was, "You.")

Another blow to Michael's reputation was dealt on 27 February 2019, when Anna published a comment badmouthing Michael and suggesting that talking to him was harmful, which I found disappointing—more so as I began to realize the implications.

I agreed with her point about how "ridicule of obviously-fallacious reasoning plays an important role in discerning which thinkers can (or can't) help" fill the role of vetting and common knowledge creation. That's why I was so heartbroken about the "categories are arbitrary, therefore trans women are women" thing, which deserved to be laughed out of the room. Why was she trying to ostracize the guy who was one of the very few to back me up on this incredibly obvious thing!? The reasons given to discredit Michael seemed weak. (He ... flatters people? He ... didn't tell people to abandon their careers? What?) And the evidence against Michael she offered in private didn't seem much more compelling (e.g., at a CfAR event, he had been insistent on continuing to talk to someone who Anna thought looked near psychosis and needed a break).

It made sense for Anna to not like Michael anymore because of his personal conduct, or because of his opposition to EA. (Expecting all of my friends to be friends with each other would be Geek Social Fallacy #4.) If she didn't want to invite him to CfAR stuff, fine. But what did she gain from publicly denouncing him as someone whose "lies/manipulations can sometimes disrupt [people's] thinking for long and costly periods of time"?! She said she was trying to undo the effects of her previous endorsements of him, and that the comment seemed like it ought to be okay by Michael's standards (which didn't include an expectation that people should collude to protect each other's reputations).


I wasn't the only one whose life was being disrupted by political drama in early 2019. On 22 February, Scott Alexander posted that the /r/slatestarcodex Culture War Thread was being moved to a new non–Slate Star Codex–branded subreddit in the hopes that would curb some of the harassment he had been receiving. Alexander claimed that according to poll data and his own impressions, the Culture War Thread featured a variety of ideologically diverse voices but had nevertheless acquired a reputation as being a hive of right-wing scum and villainy.

Yudkowsky Tweeted:

Your annual reminder that Slate Star Codex is not and never was alt-right, every real stat shows as much, and the primary promoters of this lie are sociopaths who get off on torturing incredibly nice targets like Scott A.

I found Yudkowsky's use of the word "lie" here interesting given his earlier eagerness to police the use of the word "lie" by gender-identity skeptics. With the support of my posse, I wrote to him again, a third time (Subject: "on defending against 'alt-right' categorization").

I said, imagine if one of Alexander's critics were to reply: "Using language in a way you dislike, openly and explicitly and with public focus on the language and its meaning, is not lying. The proposition you claim false (explicit advocacy of a white ethnostate?) is not what the speech is meant to convey—and this is known to everyone involved, it is not a secret. You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning. Now, maybe as a matter of policy, you want to make a case for language like 'alt-right' being used a certain way. Well, that's a separate debate then. But you're not making a stand for Truth in doing so, and your opponents aren't tricking anyone or trying to."

How would Yudkowsky react if someone said that? My model of the Sequences-era Yudkowsky of 2009 would say, "This is an intellectually dishonest attempt to sneak in connotations by performing a categorization and using an appeal-to-arbitrariness conversation-halter to avoid having to justify it; go read 'A Human's Guide to Words.'"

But I had no idea what the real Yudkowsky of 2019 would say. If the moral of the "hill of meaning in defense of validity" thread had been that the word "lie" should be reserved for per se direct falsehoods, well, what direct falsehood was being asserted by Scott's detractors? I didn't think anyone was claiming that, say, Scott identified as alt-right, any more than anyone was claiming that trans women have two X chromosomes. Commenters on /r/SneerClub had been pretty explicit in their criticism that the Culture War thread harbored racists (&c.) and possibly that Scott himself was a secret racist, with respect to a definition of racism that included the belief that there exist genetically mediated population differences in the distribution of socially relevant traits and that this probably had decision-relevant consequences that should be discussable somewhere.

And this was correct. For example, Alexander's "The Atomic Bomb Considered As Hungarian High School Science Fair Project" favorably cites Cochran et al.'s genetic theory of Ashkenazi achievement as "really compelling." Scott was almost certainly "guilty" of the category membership that the speech was meant to convey—it's just that Sneer Club got to choose the category. If a machine-learning classifier returns positive on both Scott Alexander and Richard Spencer, the correct response is not that the classifier is "lying" (what would that even mean?) but that the classifier is not very useful for understanding Scott Alexander's effects on the world.

Of course, Scott is great, and it was right that we should defend him from the bastards trying to ruin his reputation, and it was plausible that the most politically convenient way to do that was to pound the table and call them lying sociopaths rather than engaging with the substance of their claims—much as how someone being tried under an unjust law might plead "Not guilty" to save their own skin rather than tell the whole truth and hope for jury nullification.

But, I argued, political convenience came at a dire cost to our common interest. There was a proverb Yudkowsky had once failed to Google, that ran something like, "Once someone is known to be a liar, you might as well listen to the whistling of the wind."

Similarly, once someone is known to vary the epistemic standards of their public statements for political convenience—if they say categorizations can be lies when that happens to help their friends, but seemingly deny the possibility when that happens to make them look good politically ...

Well, you're still better off listening to them than the whistling of the wind, because the wind in various possible worlds is presumably uncorrelated with most of the things you want to know about, whereas clever arguers who don't tell explicit lies are constrained in how much they can mislead you. But it seems plausible that you might as well listen to any other arbitrary smart person with a blue check and 20K Twitter followers. It might be a useful exercise for Yudkowsky to think of what he would actually say if someone with social power actually did this to him when he was trying to use language to reason about Something he had to Protect.

(Note, my claim here is not that "Pronouns aren't lies" and "Scott Alexander is not a racist" are similarly misinformative. Rather, I'm saying that whether "You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning" makes sense as a response to "X isn't a Y" shouldn't depend on the specific values of X and Y. Yudkowsky's behavior the other month had made it look like he thought that "You're not standing in defense of truth if ..." was a valid response when, say, X = "Caitlyn Jenner" and Y = "woman." I was saying that whether or not it's a valid response, we should, as a matter of local validity, apply the same standard when X = "Scott Alexander" and Y = "racist.")

Without disclosing any specific content from private conversations that may or may not have happened, I can say that our posse did not get the kind of engagement from Yudkowsky that we were hoping for.

Michael said that it seemed important that, if we thought Yudkowsky wasn't interested, we should have common knowledge among ourselves that we considered him to be choosing to be a cult leader.

I settled on Sara Bareilles's "Gonna Get Over You" as my breakup song with Yudkowsky and the rationalists, often listening to a cover of it on loop to numb the pain. I found the lyrics were readily interpretable as being about my problems, even if Sara Bareilles had a different kind of breakup in mind. ("I tell myself to let the story end"—the story of the rationalists as a world-changing intellectual movement. "And my heart will rest in someone else's hand"—Michael Vassar's. "And I'm not the girl that I intend to be"—self-explanatory.)8

Meanwhile, my email thread with Scott started up again. I expressed regret that all the times I had emailed him over the past couple years had been when I was upset about something (like psych hospitals, or—something else) and wanted something from him, treating him as a means rather than an end—and then, despite that regret, I continued prosecuting the argument.

One of Alexander's most popular Less Wrong posts ever had been about the noncentral fallacy, which Alexander called "the worst argument in the world": those who (for example) crow that abortion is murder (because murder is the killing of a human being), or that Martin Luther King, Jr. was a criminal (because he defied the segregation laws of the South), are engaging in a dishonest rhetorical maneuver in which they're trying to trick their audience into assigning attributes of the typical "murder" or "criminal" to what are very noncentral members of those categories.

Even if you're opposed to abortion, or have negative views about the historical legacy of Dr. King, this isn't the right way to argue. If you call Fiona a murderer, that causes me to form a whole bunch of implicit probabilistic expectations on the basis of what the typical "murder" is like—expectations about Fiona's moral character, about the suffering of a victim whose hopes and dreams were cut short, about Fiona's relationship with the law, &c.—most of which get violated when you reveal that the murder victim was an embryo.

In the form of a series of short parables, I tried to point out that Alexander's own "The Worst Argument in the World" is complaining about the same category-gerrymandering move that his "... Not Man for the Categories" comes out in favor of. We would not let someone get away with declaring, "I ought to accept an unexpected abortion or two deep inside the conceptual boundaries of what would normally not be considered murder if it'll save someone's life." Maybe abortion is wrong and relevantly similar to the central sense of "murder", but you need to make that case on the empirical merits, not by linguistic fiat (Subject: "twelve short stories about language").

Scott still didn't get it. He didn't see why he shouldn't accept one unit of categorizational awkwardness in exchange for sufficiently large utilitarian benefits. He made an analogy to some lore from the Glowfic collaborative fiction writing community, a story about orcs who had unwisely sworn an oath to serve the evil god Melkor. Though the orcs intend no harm of their own will, they're magically bound to obey Melkor's commands and serve as his terrible army or else suffer unbearable pain. Our heroine comes up with a solution: she founds a new religion featuring a deist God who also happens to be named "Melkor". She convinces the orcs that since the oath didn't specify which Melkor, they're free to follow her new God instead of evil Melkor, and the magic binding the oath apparently accepts this casuistry if the orcs themselves do.

Scott's attitude toward the new interpretation of the oath in the story was analogous to his thinking about transgenderedness: sure, the new definition may be a little awkward and unnatural, but it's not objectively false, and it made life better for so many orcs. If rationalists should win, then the true rationalist in this story was the one who thought up this clever hack to save an entire species.

I started drafting a long reply—but then I remembered that in recent discussion with my posse, the idea had come up that in-person meetings are better for resolving disagreements. Would Scott be up for meeting in person some weekend? Non-urgent. Ben would be willing to moderate, unless Scott wanted to suggest someone else, or no moderator.

Scott didn't want to meet. I considered resorting to the tool of cheerful prices, which I hadn't yet used against Scott—to say, "That's totally understandable! Would a financial incentive change your decision? For a two-hour meeting, I'd be happy to pay up to $4000 to you or your preferred charity. If you don't want the money, then let's table this. I hope you're having a good day." But that seemed sufficiently psychologically coercive and socially weird that I wasn't sure I wanted to go there. On 18 March, I emailed my posse asking what they thought—and then added that maybe they shouldn't reply until Friday, because it was Monday, and I really needed to focus on my dayjob that week.

This is the part where I began to ... overheat. I tried ("tried") to focus on my dayjob, but I was just so angry. Did Scott really not understand the rationality-relevant distinction between "value-dependent categories as a result of caring about predicting different variables" (as explained by the dagim/water-dwellers vs. fish example in "... Not Man for the Categories") and "value-dependent categories in order to not make my friends sad"? Was he that dumb? Or was it that he was only verbal-smart, and this is the sort of thing that only makes sense if you've ever been good at linear algebra? (Such that the language of "only running your clustering algorithm on the subspace of the configuration space spanned by the variables that are relevant to your decisions" would come naturally.) Did I need to write a post explaining just that one point in mathematical detail, with executable code and a worked example with entropy calculations?

My dayjob boss made it clear that he was expecting me to have code for my current Jira tickets by noon the next day, so I deceived myself into thinking I could accomplish that by staying at the office late. Maybe I could have caught up, if it were just a matter of the task being slightly harder than anticipated and I weren't psychologically impaired from being hyper-focused on the religious war. The problem was that focus is worth 30 IQ points, and an IQ 100 person can't do my job.

I was in so much (psychological) pain. Or at least, in one of a series of emails to my posse that night, I felt motivated to type the sentence, "I'm in so much (psychological) pain." I'm never sure how to interpret my own self-reports, because even when I'm really emotionally trashed (crying, shaking, randomly yelling, &c.), I think I'm still noticeably incentivizable: if someone were to present a credible threat (like slapping me and telling me to snap out of it), then I would be able to calm down. There's some sort of game-theory algorithm in the brain that feels subjectively genuine distress (like crying or sending people too many hysterical emails) but only when it can predict that it will be rewarded with sympathy or at least tolerated: tears are a discount on friendship.

I Tweeted a Sequences quote (the mention of @ESYudkowsky being to attribute credit, I told myself; I figured Yudkowsky had enough followers that he probably wouldn't see a notification):

"—and if you still have something to protect, so that you MUST keep going, and CANNOT resign and wisely acknowledge the limitations of rationality— [1/3]

"—then you will be ready to start your journey[.] To take sole responsibility, to live without any trustworthy defenses, and to forge a higher Art than the one you were once taught. [2/3]

"No one begins to truly search for the Way until their parents have failed them, their gods are dead, and their tools have shattered in their hand." —@ESYudkowsky (https://www.lesswrong.com/posts/wustx45CPL5rZenuo/no-safe-defense-not-even-science) [end/3]

Only it wasn't quite appropriate. The quote is about failure resulting in the need to invent new methods of rationality, better than the ones you were taught. But the methods I had been taught were great! I didn't have a pressing need to improve on them! I just couldn't cope with everyone else having forgotten!

I did eventually get some dayjob work done that night, but I didn't finish the whole thing my manager wanted done by the next day, and at 4 a.m., I concluded that I needed sleep, the lack of which had historically been very dangerous for me (being the trigger for my 2013 and 2017 psychotic breaks and subsequent psych imprisonments). We really didn't want another outcome like that. There was a couch in the office, and probably another four hours until my coworkers started to arrive. The thing I needed to do was just lie down on the couch in the dark and have faith that sleep would come. Meeting my manager's deadline wasn't that important. When people came in to the office, I might ask for help getting an Uber home? Or help buying melatonin? The important thing was to be calm.

I sent an email explaining this to Scott and my posse and two other friends (Subject: "predictably bad ideas").

Lying down didn't work. So at 5:26 a.m., I sent an email to Scott cc'ing my posse plus Anna about why I was so mad (both senses). I had a better draft sitting on my desktop at home, but since I was here and couldn't sleep, I might as well type this version (Subject: "five impulsive points, hastily written because I just can't even (was: Re: predictably bad ideas)"). Scott had been continuing to insist it's okay to gerrymander category boundaries for trans people's mental health, but there were a few things I didn't understand. If creatively reinterpreting the meanings of words because the natural interpretation would make people sad is okay, why didn't that generalize to an argument in favor of outright lying when the truth would make people sad? The mind games seemed crueler to me than a simple lie. Also, if "mental health benefits for trans people" matter so much, then why didn't my mental health matter? Wasn't I trans, sort of? Getting shut down by appeal-to-utilitarianism when I was trying to use reason to make sense of the world was observably really bad for my sanity!

Also, Scott had asked me if it wouldn't be embarrassing if the community solved Friendly AI and went down in history as the people who created Utopia forever, and I had rejected it because of gender stuff. But the original reason it had ever seemed remotely plausible that we would create Utopia forever wasn't "because we're us, the world-saving good guys," but because we were going to perfect an art of systematically correct reasoning. If we weren't going to do systematically correct reasoning because that would make people sad, then that undermined the reason that it was plausible that we would create Utopia forever.

Also-also, Scott had proposed a super–Outside View of the culture war as an evolutionary process that produces memes optimized to trigger PTSD syndromes and suggested that I think of that as what was happening to me. But, depending on how much credence Scott put in social proof, mightn't the fact that I managed to round up this whole posse to help me repeatedly argue with (or harass) Yudkowsky shift his estimate over whether my concerns had some objective merit that other people could see, too? It could simultaneously be the case that I had culture-war PTSD and my concerns had merit.

Michael replied at 5:58 a.m., saying that everyone's first priority should be making sure that I could sleep—that given that I was failing to adhere to my commitments to sleep almost immediately after making them, I should be interpreted as urgently needing help, and that Scott had comparative advantage in helping, given that my distress was most centrally over Scott gaslighting me, asking me to consider the possibility that I was wrong while visibly not considering the same possibility regarding himself.

That seemed a little harsh on Scott to me. At 6:14 a.m. and 6:21 a.m., I wrote a couple emails to everyone that my plan was to get a train back to my own apartment to sleep, that I was sorry for making such a fuss despite being incentivizable while emotionally distressed, that I should be punished in accordance with the moral law for sending too many hysterical emails because I thought I could get away with it, that I didn't need Scott's help, and that I thought Michael was being a little aggressive about that, but that I guessed that's also kind of Michael's style.

Michael was furious with me. ("What the FUCK Zack!?! Calling now," he emailed me at 6:18 a.m.) I texted and talked with him on my train ride home. He seemed to have a theory that people who are behaving badly, as Scott was, will only change when they see a victim who is being harmed. Me escalating and then immediately deescalating just after Michael came to help was undermining the attempt to force an honest confrontation, such that we could get to the point of having a Society with morality or punishment.

Anyway, I did get to my apartment and sleep for a few hours. One of the other friends I had cc'd on some of the emails, whom I'll call "Meredith", came to visit me later that morning with her 2½-year-old son—I mean, her son at the time.

(Incidentally, the code that I had written intermittently between 11 p.m. and 4 a.m. was a horrible bug-prone mess, and the company has been paying for it ever since.)

At some level, I wanted Scott to know how frustrated I was about his use of "mental health for trans people" as an Absolute Denial Macro. But when Michael started advocating on my behalf, I started to minimize my claims because I had a generalized attitude of not wanting to sell myself as a victim. Ben pointed out that making oneself mentally ill in order to extract political concessions only works if you have a lot of people doing it in a visibly coordinated way—and even if it did work, getting into a dysphoria contest with trans people didn't seem like it led anywhere good.

I supposed that in Michael's worldview, aggression is more honest than passive-aggression. That seemed true, but I was psychologically limited in how much overt aggression I was willing to deploy against my friends. (And particularly Yudkowsky, whom I still hero-worshiped.) But clearly, the tension between "I don't want to do too much social aggression" and "Losing the Category War within the rationalist community is absolutely unacceptable" was causing me to make wildly inconsistent decisions. (Emailing Scott at 4 a.m. and then calling Michael "aggressive" when he came to defend me was just crazy: either one of those things could make sense, but not both.)

Did I just need to accept that there was no such thing as a "rationalist community"? (Sarah had told me as much two years ago while tripsitting me during my psychosis relapse, but I hadn't made the corresponding mental adjustments.)

On the other hand, a possible reason to be attached to the "rationalist" brand name and social identity that wasn't just me being stupid was that the way I talk had been trained really hard on this subculture for ten years. Most of my emails during this whole campaign had contained multiple Sequences or Slate Star Codex links that I could expect the recipients to have read. I could use the phrase "Absolute Denial Macro" in conversation and expect to be understood. If I gave up on the "rationalists" being a thing, and went out into the world to make friends with Quillette readers or arbitrary University of Chicago graduates, then I would lose all that accumulated capital. Here, I had a massive home territory advantage because I could appeal to Yudkowsky's writings about the philosophy of language from ten years ago and people couldn't say, "Eliezer who? He's probably a Bad Man."

The language I spoke was mostly educated American English, but I relied on subculture dialect for a lot. My sister has a chemistry doctorate from MIT (and so speaks the language of STEM intellectuals generally), and when I showed her "... To Make Predictions", she reported finding it somewhat hard to read, likely because I casually use phrases like "thus, an excellent motte" and expect to be understood without the reader taking 10 minutes to read the link. That essay, which was me writing from the heart in the words that came most naturally to me, could not be published in Quillette. The links and phraseology were just too context-bound.

Maybe that's why I felt like I had to stand my ground and fight for the world I was made in, even though the contradiction between the war effort and my general submissiveness had me making crazy decisions.

Michael said that a reason to make a stand here in "the community" was because if we didn't, the beacon of "rationalism" would continue to lure and mislead others—but that more importantly, we needed to figure out how to win this kind of argument decisively, as a group. We couldn't afford to accept a status quo of accepting defeat when faced with bad faith arguments in general. Ben reported writing to Scott to ask him to alter the beacon so that people like me wouldn't think "the community" was the place to go for the rationality thing anymore.

As it happened, the next day, we saw these Tweets from @ESYudkowsky, linking to a Quillette article interviewing Lisa Littman about her work positing a socially contagious "rapid onset" type of gender dysphoria among young females:

Everything more complicated than protons tends to come in varieties. Hydrogen, for example, has isotopes. Gender dysphoria involves more than one proton and will probably have varieties. https://quillette.com/2019/03/19/an-interview-with-lisa-littman-who-coined-the-term-rapid-onset-gender-dysphoria/

To be clear, I don't know much about gender dysphoria. There's an allegation that people are reluctant to speciate more than one kind of gender dysphoria. To the extent that's not a strawman, I would say only in a generic way that GD seems liable to have more than one species.

(Why now? Maybe he saw the tag in my "tools have shattered" Tweet on Monday, or maybe the Quillette article was just timely?)

The most obvious reading of these Tweets was as a political concession to me. The two-type taxonomy of MtF was the thing I was originally trying to talk about, back in 2016–2017, before getting derailed onto the present philosophy-of-language war, and here Yudkowsky was backing up my side on that.

At this point, some readers might think that this should have been the end of the matter, that I should have been satisfied. I had started the recent drama flare-up because Yudkowsky had Tweeted something unfavorable to my agenda. But now, Yudkowsky was Tweeting something favorable to my agenda! Wouldn't it be greedy and ungrateful for me to keep criticizing him about the pronouns and language thing, given that he'd thrown me a bone here? Shouldn't I call it even?

That's not how it works. The entire concept of "sides" to which one can make "concessions" is an artifact of human coalitional instincts. It's not something that makes sense as a process for constructing a map that reflects the territory. My posse and I were trying to get a clarification about a philosophy-of-language claim Yudkowsky had made a few months prior ("you're not standing in defense of truth if [...]"). Why would we stop prosecuting that because of this unrelated Tweet about the etiology of gender dysphoria? That wasn't the thing we were trying to clarify!

Moreover—and I'm embarrassed that it took me another day to realize this—this new argument from Yudkowsky about the etiology of gender dysphoria was wrong. As I would later get around to explaining in "On the Argumentative Form 'Super-Proton Things Tend to Come in Varieties'", when people claim that some psychological or medical condition "comes in varieties", they're making a substantive empirical claim that the causal or statistical structure of the condition is usefully modeled as distinct clusters, not merely making the trivial observation that instances of the condition are not identical down to the subatomic level.

So we shouldn't think that there are probably multiple kinds of gender dysphoria because things are made of protons. If anything, a priori reasoning about the cognitive function of categorization should actually cut in the other direction, (mildly) against rather than in favor of multi-type theories: you only want to add more categories to your theory if they can pay for their additional complexity with better predictions. If you believe in Blanchard–Bailey–Lawrence's two-type taxonomy of MtF, or Littman's proposed rapid-onset type, it should be on the empirical merits, not because multi-type theories are a priori more likely to be true (which they aren't).
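To make "pay for their additional complexity with better predictions" concrete, here is a minimal sketch under loudly stated assumptions: a proposed taxonomy sorts cases into two types, we want to predict a single numeric trait, and we score models with the Bayesian information criterion (one standard way of charging for extra parameters, where lower is better). The trait values, the Gaussian modeling choice, and the function names are all invented for illustration; none of this comes from the emails or from any real dataset.

```python
import math


def gaussian_log_likelihood(values):
    """Maximized log-likelihood of `values` under a single Gaussian.

    Uses the maximum-likelihood mean and variance; assumes the values
    are not all identical (so the variance is nonzero).
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: complexity penalty minus fit."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def compare_taxonomies(type_a, type_b):
    """Return (one_type_bic, two_type_bic) for predicting the trait.

    One-type model: a single Gaussian for everyone (2 parameters).
    Two-type model: a separate Gaussian per type (4 parameters).
    """
    pooled = type_a + type_b
    n = len(pooled)
    one_type = bic(gaussian_log_likelihood(pooled), 2, n)
    two_type = bic(
        gaussian_log_likelihood(type_a) + gaussian_log_likelihood(type_b), 4, n
    )
    return one_type, two_type


# Hypothetical trait measurements, chosen by hand for the illustration.
base = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]

# Case 1: the two proposed "types" look exactly alike on the trait.
print("no real difference:", compare_taxonomies(base, list(base)))

# Case 2: the second type is shifted, so the distinction predicts something.
print("real difference:   ", compare_taxonomies(base, [v + 5.0 for v in base]))
```

In the first case the extra category buys no improvement in fit, so the two-type model scores worse purely on the complexity penalty; in the second, the improvement in fit outweighs the penalty. Either way, the verdict comes from the data, not from the bare fact that the cases are more complicated than protons.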

Had Yudkowsky been thinking that maybe if he Tweeted something favorable to my agenda, then I and the rest of Michael's gang would be satisfied and leave him alone?

But if there's some other reason you suspect there might be multiple species of dysphoria, but you tell people your suspicion is because "everything more complicated than protons tends to come in varieties", you're still misinforming people for political reasons, which was the general problem we were trying to alert Yudkowsky to. Inventing fake rationality lessons in response to political pressure is not okay, and the fact that in this case the political pressure happened to be coming from me didn't make it okay.

I asked the posse if this analysis was worth sending to Yudkowsky. Michael said it wasn't worth the digression. He asked if I was comfortable generalizing from Scott's behavior, and what others had said about fear of speaking openly, to assuming that something similar was going on with Eliezer? If so, then now that we had common knowledge, we needed to confront the actual crisis, "that dread is tearing apart old friendships and causing fanatics to betray everything that they ever stood for while its existence is still being denied."


That week, former MIRI researcher Jessica Taylor joined our posse (being at an in-person meeting with Ben and Sarah and another friend on the seventeenth, and getting tagged in subsequent emails). I had met Jessica for the first time in March 2017, shortly after my psychotic break, and I had been part of the group trying to take care of her when she had her own break in late 2017, but other than that, we hadn't been particularly close.

Significantly for political purposes, Jessica is trans. We didn't have to agree up front on all gender issues for her to see the epistemology problem with "... Not Man for the Categories", and to say that maintaining a narcissistic fantasy by controlling category boundaries wasn't what she wanted, as a trans person. (On the seventeenth, when I lamented the state of a world that incentivized us to be political enemies, her response was, "Well, we could talk about it first.") Michael said that Jessica and I together had more moral authority than either of us alone.

As it happened, I ran into Scott on the BART train that Friday, the twenty-second. He said he wasn't sure why the oft-repeated moral of "A Human's Guide to Words" had been "You can't define a word any way you want" rather than "You can define a word any way you want, but then you have to deal with the consequences."

Ultimately, I thought this was a pedagogy decision that Yudkowsky had gotten right back in 2008. If you write your summary slogan in relativist language, people predictably take that as license to believe whatever they want without having to defend it. Whereas if you write your summary slogan in objectivist language—so that people know they don't have social permission to say, "It's subjective, so I can't be wrong"—then you have some hope of sparking useful thought about the exact, precise ways that specific, definite things are relative to other specific, definite things.

I told Scott I would send him one more email with a piece of evidence about how other "rationalists" were thinking about the categories issue and give my commentary on the parable about orcs, and then the present thread would probably drop there.

Concerning what others were thinking: on Discord in January, Kelsey Piper had told me that everyone else experienced their disagreement with me as being about where the joints are and which joints are important, where usability for humans was a legitimate criterion of importance, and it was annoying that I thought they didn't believe in carving reality at the joints at all and that categories should be whatever makes people happy.

I didn't want to bring it up at the time because I was so overjoyed that the discussion was actually making progress on the core philosophy-of-language issue, but Scott did seem to be pretty explicit that his position was about happiness rather than usability? If Kelsey thought she agreed with Scott, but actually didn't, that sham consensus was a bad sign for our collective sanity, wasn't it?

As for the parable about orcs, I thought it was significant that Scott chose to tell the story from the standpoint of non-orcs deciding what verbal behaviors to perform while orcs are around, rather than the standpoint of the orcs themselves. For one thing, how do you know that serving evil-Melkor is a life of constant torture? Is it at all possible that someone has given you misleading information about that?

Moreover, you can't just give an orc a clever misinterpretation of an oath and have them believe it. First you have to cripple their general ability to correctly interpret oaths, for the same reason that you can't get someone to believe that 2+2=5 without crippling their general ability to do arithmetic. We weren't talking about a little "white lie" that the listener will never get to see falsified (like telling someone their dead dog is in heaven); the orcs already know the text of the oath, and you have to break their ability to understand it. Are you willing to permanently damage an orc's ability to reason in order to save them pain? For some sufficiently large amount of pain, surely. But this isn't a choice to make lightly—and the choices people make to satisfy their own consciences don't always line up with the volition of their alleged beneficiaries. We think we can lie to save others from pain, without wanting to be lied to ourselves. But behind the veil of ignorance, it's the same choice!

I also had more to say about philosophy of categories: I thought I could be more rigorous about the difference between "caring about predicting different variables" and "caring about consequences", in a way that Eliezer would have to understand even if Scott didn't. (Scott had claimed that he could use gerrymandered categories and still be just as good at making predictions—but that's just not true if we're talking about the internal use of categories as a cognitive algorithm, rather than mere verbal behavior. It's easy to say "X is a Y" for arbitrary X and Y if the stakes demand it, but that's not the same thing as using that concept of Y internally as part of your world-model.)

But after consultation with the posse, I concluded that further email prosecution was not useful at this time; the philosophy argument would work better as a public Less Wrong post. So my revised Category War to-do list was:

  • Send the brief wrapping-up/end-of-conversation email to Scott (with the Discord anecdote about Kelsey and commentary on the orc story).
  • Mentally write off Scott, Eliezer, and the so-called "rationalist" community as a loss so that I wouldn't be in horrible emotional pain from cognitive dissonance all the time.
  • Write up the mathy version of the categories argument for Less Wrong (which I thought might take a few months—I had a dayjob, and write slowly, and might need to learn some new math, which I'm also slow at).
  • Then email the link to Scott and Eliezer asking for a signal boost and/or court ruling.

Ben didn't think the mathematically precise categories argument was the most important thing for Less Wrong readers to know about: a similarly careful explanation of why I'd written off Scott, Eliezer, and the "rationalists" would be way more valuable.

I could see the value he was pointing at, but something in me balked at the idea of attacking my friends in public (Subject: "treachery, faith, and the great river (was: Re: DRAFTS: 'wrapping up; or, Orc-ham's razor' and 'on the power and efficacy of categories')").

Ben had previously written (in the context of the effective altruism movement) about how holding criticism to a higher standard than praise distorts our collective map. He was obviously correct that this was a distortionary force relative to what ideal Bayesian agents would do, but I was worried that when we're talking about criticism of people rather than ideas, the removal of the distortionary force would just result in social conflict (and not more truth). Criticism of institutions and social systems should be filed under "ideas" rather than "people", but the smaller-scale you get, the harder this distinction is to maintain: criticizing, say, "the Center for Effective Altruism", somehow feels more like criticizing Will MacAskill personally than criticizing "the United States" does, even though neither CEA nor the U.S. is a person.

That was why I couldn't give up faith that honest discourse eventually wins. Under my current strategy and consensus social norms, I could criticize Scott or Kelsey or Ozy's ideas without my social life dissolving into a war of all against all, whereas if I were to give in to the temptation to flip a table and say, "Okay, now I know you guys are just messing with me," then I didn't see how that led anywhere good, even if they really were.

Jessica explained what she saw as the problem with this. What Ben was proposing was creating clarity about behavioral patterns. I was saying that I was afraid that creating such clarity is an attack on someone. But if so, then my blog was an attack on trans people. What was going on here?

Socially, creating clarity about behavioral patterns is construed as an attack and can make things worse for someone. For example, if your livelihood is based on telling a story about you and your flunkies being the only sane truthseeking people in the world, then me demonstrating that you don't care about the truth when it's politically inconvenient is a threat to your marketing story and therefore to your livelihood. As a result, it's easier to create clarity down power gradients than up them: it was easy for me to blow the whistle on trans people's narcissistic delusions, but hard to blow the whistle on Yudkowsky's.9

But selectively creating clarity down but not up power gradients just reinforces existing power relations—in the same way that selectively criticizing arguments with politically unfavorable conclusions only reinforces your current political beliefs. I shouldn't be able to get away with claiming that calling non-exclusively-androphilic trans women delusional perverts is okay on the grounds that that which can be destroyed by the truth should be, but that calling out Alexander and Yudkowsky would be unjustified on the grounds of starting a war or whatever. Jessica was on board with a project to tear down narcissistic fantasies in general, but not a project that starts by tearing down trans people's narcissistic fantasies, then emits spurious excuses for not following that effort where it leads.

Somewhat apologetically, I replied that the distinction between truthfully, publicly criticizing group identities and named individuals still seemed important to me?—as did avoiding leaking info from private conversations. I would be more comfortable writing a scathing blog post about the behavior of "rationalists", than about a specific person not adhering to good discourse norms in an email conversation that they had good reason to expect to be private. I thought I was consistent about this; contrast my writing with the way that some anti-trans writers name and shame particular individuals. (The closest I had come was mentioning Danielle Muscato as someone who doesn't pass—and even there, I admitted it was "unclassy" and done out of desperation.) I had to acknowledge that criticism of non-exclusively-androphilic trans women in general implied criticism of Jessica, and criticism of "rationalists" in general implied criticism of Yudkowsky and Alexander and me, but the extra inferential step and "fog of probability" seemed to make the speech act less of an attack. Was I wrong?

Michael said this was importantly backwards: less precise targeting is more violent. If someone said, "Michael Vassar is a terrible person," he would try to be curious, but if they didn't have an argument, he would tend to worry more "for" them and less "about" them, whereas if someone said, "The Jews are terrible people," he saw that as a more serious threat to his safety. (And rationalists and trans women are exactly the sort of people who get targeted by the same people who target Jews.)


Polishing the advanced categories argument from earlier email drafts into a solid Less Wrong post didn't take that long: by 6 April 2019, I had an almost complete draft of the new post, "Where to Draw the Boundaries?", that I was pretty happy with.

The title (note: "boundaries", plural) was a play off of "Where to Draw the Boundary?" (note: "boundary", singular), a post from Yudkowsky's original Sequence on the 37 ways in which words can be wrong. In "... Boundary?", Yudkowsky asserts (without argument, as something that all educated people already know) that dolphins don't form a natural category with fish ("Once upon a time it was thought that the word 'fish' included dolphins [...] Or you could stop playing nitwit games and admit that dolphins don't belong on the fish list"). But Alexander's "... Not Man for the Categories" directly contradicts this, asserting that there's nothing wrong with the biblical Hebrew word dagim encompassing both fish and cetaceans (dolphins and whales). So who's right—Yudkowsky (2008) or Alexander (2014)? Is there a problem with dolphins being "fish", or not?

In "... Boundaries?", I unify the two positions and explain how both Yudkowsky and Alexander have a point: in high-dimensional configuration space, there's a cluster of finned water-dwelling animals in the subspace of the dimensions along which finned water-dwelling animals are similar to each other, and a cluster of mammals in the subspace of the dimensions along which mammals are similar to each other, and dolphins belong to both of them. Which subspace you pay attention to depends on your values: if you don't care about predicting or controlling some particular variable, you have no reason to look for similarity clusters along that dimension.

But given a subspace of interest, the technical criterion of drawing category boundaries around regions of high density in configuration space still applies. There is Law governing which uses of communication signals transmit which information, and the Law can't be brushed off with, "whatever, it's a pragmatic choice, just be nice." I demonstrate the Law with a couple of simple mathematical examples: if you redefine a codeword that originally pointed to one cluster in ℝ³, to also include another, that changes the quantitative predictions you make about an unobserved coordinate given the codeword; if an employer starts giving the title "Vice President" to line workers, that decreases the mutual information between the job title and properties of the job.
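To make the job-title example concrete, here is a minimal sketch of the kind of calculation involved. It is not the computation from the post itself: the company, the head counts, and the "Associate" title are all invented for illustration, and the observed frequencies are simply treated as the joint distribution.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information (in bits) between the two components of (x, y) pairs,
    treating the observed frequencies as the joint distribution."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical company: 10 executives and 90 line workers.
# Before title inflation, the title tells you the job exactly.
honest = (
    [("Vice President", "executive")] * 10
    + [("Associate", "line worker")] * 90
)

# After title inflation, 40 of the line workers are also called "Vice President".
inflated = (
    [("Vice President", "executive")] * 10
    + [("Vice President", "line worker")] * 40
    + [("Associate", "line worker")] * 50
)

mi_honest = mutual_information(honest)
mi_inflated = mutual_information(inflated)

print(f"honest titles:   {mi_honest:.3f} bits")
print(f"inflated titles: {mi_inflated:.3f} bits")
```

Under these made-up numbers, the title carries about 0.469 bits of information about the job before the inflation and about 0.108 bits after. The word is still usable, but hearing it tells you less about the person's job.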

(Jessica and Ben's discussion of the job title example in relation to the Wikipedia summary of Jean Baudrillard's Simulacra and Simulation got published separately and ended up taking on a life of its own in future posts, including a number of posts by other authors.)

Sarah asked if the math wasn't a bit overkill: were the calculations really necessary to make the basic point that good definitions should be about classifying the world, rather than about what's pleasant or politically expedient to say?

I thought the math was important as an appeal to principle—and as intimidation. (As it was written, the tenth virtue is precision! Even if you cannot do the math, knowing that the math exists tells you that the dance step is precise and has no room in it for your whims.)

"... Boundaries?" explains all this in the form of discourse with a hypothetical interlocutor arguing for the I-can-define-a-word-any-way-I-want position. In the hypothetical interlocutor's parts, I wove in verbatim quotes (without attribution) from Alexander ("an alternative categorization system is not an error, and borders are not objectively true or false") and Yudkowsky ("You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning"; "Using language in a way you dislike is not lying. The propositions you claim false [...] is not what the [...] is meant to convey, and this is known to everyone involved; it is not a secret") and Bensinger ("doesn't unambiguously refer to the thing you're trying to point at").

My thinking here was that the posse's previous email campaigns had been doomed to failure by being too closely linked to the politically contentious object-level topic, which reputable people had strong incentives not to touch with a ten-meter pole. So if I wrote this post just explaining what was wrong with the claims Yudkowsky and Alexander had made about the philosophy of language, with perfectly innocent examples about dolphins and job titles, that would remove the political barrier to Yudkowsky correcting the philosophy of language error. If someone with a threatening social-justicey aura were to say, "Wait, doesn't this contradict what you said about trans people earlier?", the reputable people could stonewall them. (Stonewall them and not me!)

Another reason someone might be reluctant to correct mistakes when pointed out is the fear that such a policy could be abused by motivated nitpickers. It would be pretty annoying to be obligated to churn out an endless stream of trivial corrections by someone motivated to comb through your entire portfolio and point out every little thing you did imperfectly, ever.

I wondered if maybe, in Scott or Eliezer's mental universe, I was a blameworthy (or pitiably mentally ill) nitpicker for flipping out over a blog post from 2014 (!) and some Tweets (!!) from November. I, too, had probably said things that were wrong five years ago.

But I thought I had made a pretty convincing case that a lot of people were making a correctable and important rationality mistake, such that the cost of a correction (about the philosophy of language specifically, not any possible implications for gender politics) would be justified here. As Ben pointed out, if someone had put this much effort into pointing out an error I had made four months or five years ago and making careful arguments for why it was important to get the right answer, I probably would put some serious thought into it.

I could see a case that it was unfair of me to include political subtext and then only expect people to engage with the politically clean text, but if we weren't going to get into full-on gender-politics on Less Wrong (which seemed like a bad idea), but gender politics was motivating an epistemology error, I wasn't sure what else I was supposed to do. I was pretty constrained here!

(I did regret having accidentally poisoned the well the previous month by impulsively sharing "Blegg Mode" as a Less Wrong linkpost. "Blegg Mode" had originally been drafted as part of "... To Make Predictions" before getting spun off as a separate post. Frustrated in March at our failing email campaign, I thought it was politically "clean" enough to belatedly share, but it proved to be insufficiently deniably allegorical, as evidenced by the 60-plus-entry trainwreck of a comments section. It's plausible that some portion of the Less Wrong audience would have been more receptive to "... Boundaries?" if they hadn't been alerted to the political context by the comments on the "Blegg Mode" linkpost.)

On 13 April 2019, I pulled the trigger on publishing "... Boundaries?", and wrote to Yudkowsky again, a fourth time (!), asking if he could either publicly endorse the post, or publicly comment on what he thought the post got right and what he thought it got wrong. I also asked whether, if engaging on this level was too expensive for him in terms of spoons, there was any action I could take to somehow make it less expensive. The reason I thought this was important, I explained, was that if rationalists in good standing find themselves in a persistent disagreement about rationality itself, that seemed like a major concern for our common interest, something we should be eager to definitively settle in public (or at least clarify the current state of the disagreement). In the absence of a rationality court of last resort, I feared the closest thing we had was an appeal to Eliezer Yudkowsky's personal judgment. Despite the context in which the dispute arose, this wasn't a political issue. The post I was asking for his comment on was just about the mathematical laws governing how to talk about, e.g., dolphins. We had nothing to be afraid of here. (Subject: "movement to clarity; or, rationality court filing").

I got some pushback from Ben and Jessica about claiming that this wasn't "political". What I meant by that was to emphasize (again) that I didn't expect Yudkowsky or "the community" to take a public stance on gender politics. Rather, I was trying to get "us" to take a stance in favor of the kind of epistemology that we were doing in 2008. It turns out that epistemology has implications for gender politics that are unsafe, but that's more inferential steps. And I guess I didn't expect the sort of people who would punish good epistemology to follow the inferential steps?

Anyway, again without revealing any content from the other side of any private conversations that may or may not have occurred, we did not get any public engagement from Yudkowsky.

It seemed that the Category War was over, and we lost.

We lost?! How could we lose?! The philosophy here was clear-cut. This shouldn't be hard or expensive or difficult to clear up. I could believe that Alexander was "honestly" confused, but Yudkowsky?

I could see how, under ordinary circumstances, asking Yudkowsky to weigh in on my post would be inappropriately demanding of a Very Important Person's time, given that an ordinary programmer such as me was surely as a mere worm in the presence of the great Eliezer Yudkowsky. (I would have humbly given up much sooner if I hadn't gotten social proof from Michael and Ben and Sarah and "Riley" and Jessica.)

But the only reason for my post to exist was that it would be even more inappropriately demanding to ask for a clarification in the original gender-political context. The economist Thomas Schelling (of "Schelling point" fame) once wrote about the use of clever excuses to help one's negotiating counterparty release themself from a prior commitment: "One must seek [...] a rationalization by which to deny oneself too great a reward from the opponent's concession, otherwise the concession will not be made."10 This is what I was trying to do when soliciting—begging for—engagement or endorsement of "... Boundaries?" By making the post be about dolphins, I was trying to deny myself too great a reward on the gender-politics front. I don't think it was inappropriately demanding to expect "us" (him) to be correct about the cognitive function of categorization. I was trying to be as accommodating as I could, short of just letting him (us?) be wrong.

I would have expected him to see why we had to make a stand here, where the principles of reasoning that made it possible for words to be assigned interpretations at all were under threat.

A hill of validity in defense of meaning.

Maybe that's not how politics works? Could it be that, somehow, the mob-punishment mechanisms that weren't smart enough to understand the concept of "bad argument (categories are arbitrary) for a true conclusion (trans people are OK)", were smart enough to connect the dots between my broader agenda and my abstract philosophy argument, such that VIPs didn't think they could endorse my philosophy argument, without it being construed as an endorsement of me and my detailed heresies?

Jessica mentioned talking with someone about me writing to Yudkowsky and Alexander about the category boundary issue. This person described having a sense that I should have known it wouldn't work—because of the politics involved, not because I wasn't right. I thought Jessica's takeaway was poignant:

Those who are savvy in high-corruption equilibria maintain the delusion that high corruption is common knowledge, to justify expropriating those who naively don't play along, by narratizing them as already knowing and therefore intentionally attacking people, rather than being lied to and confused.

Should I have known that it wouldn't work? Didn't I "already know", at some level?

I guess in retrospect, the outcome does seem kind of obvious—that it should have been possible to predict in advance, and to make the corresponding update without so much fuss and wasting so many people's time.

But it's only "obvious" if you take as a given that Yudkowsky is playing a savvy Kolmogorov complicity strategy like any other public intellectual in the current year.

Maybe this seems banal if you haven't spent your entire adult life in his robot cult. From anyone else in the world, I wouldn't have had a problem with the "hill of meaning in defense of validity" thread—I would have respected it as a solidly above-average philosophy performance before setting the bozo bit on the author and getting on with my day. But since I did spend my entire adult life in Yudkowsky's robot cult, trusting him the way a Catholic trusts the Pope, I had to assume that it was an "honest mistake" in his rationality lessons, and that honest mistakes could be honestly corrected if someone put in the effort to explain the problem. The idea that Eliezer Yudkowsky was going to behave just as badly as any other public intellectual in the current year was not really in my hypothesis space.

Ben shared the account of our posse's email campaign with someone who commented that I had "sacrificed all hope of success in favor of maintaining his own sanity by CC'ing you guys." That is, if I had been brave enough to confront Yudkowsky by myself, maybe there was some hope of him seeing that the game he was playing was wrong. But because I was so cowardly as to need social proof (because I believed that an ordinary programmer such as me was as a mere worm in the presence of the great Eliezer Yudkowsky), it probably just looked to him like an illegible social plot originating from Michael.

One might wonder why this was such a big deal to us. Okay, so Yudkowsky had prevaricated about his own philosophy of language for political reasons, and he couldn't be moved to clarify even after we spent an enormous amount of effort trying to explain the problem. So what? Aren't people wrong on the internet all the time?

This wasn't just anyone being wrong on the internet. In an essay on the development of cultural traditions, Scott Alexander had written that rationalism is the belief that Eliezer Yudkowsky is the rightful caliph. To no small extent, I and many other people had built our lives around a story that portrayed Yudkowsky as almost uniquely sane—a story that put MIRI, CfAR, and the "rationalist community" at the center of the universe, the ultimate fate of the cosmos resting on our individual and collective mastery of the hidden Bayesian structure of cognition.

But my posse and I had just falsified to our satisfaction the claim that Yudkowsky was currently sane in the relevant way. Maybe he didn't think he had done anything wrong (because he hadn't strictly lied), and probably a normal person would think we were making a fuss about nothing, but as far as we were concerned, the formerly rightful caliph had relinquished his legitimacy. A so-called "rationalist" community that couldn't clarify this matter of the cognitive function of categories was a sham. Something had to change if we wanted a place in the world for the spirit of "naïve" (rather than politically savvy) inquiry to survive.

(To be continued. Yudkowsky would eventually clarify his position on the philosophy of categorization in September 2020—but the story leading up to that will have to wait for another day.)


  1. Similarly, in automobile races, you want rules to enforce that all competitors have the same type of car, for some commonsense operationalization of "the same type", because a race between a sports car and a moped would be mostly measuring who has the sports car, rather than who's the better racer. 

  2. And in the case of sports, the facts are so lopsided that if we must find humor in the matter, it really goes the other way. A few years later, Lia Thomas would dominate an NCAA women's swim meet by finishing 4.2 standard deviations (!!) earlier than the median competitor, and Eliezer Yudkowsky feels obligated to pretend not to see the problem? You've got to admit, that's a little bit funny. 

  3. Despite my misgivings, this blog was still published under a pseudonym at the time; it would have been hypocritical of me to accuse someone of cowardice about what they're willing to attach their real name to. 

  4. The title was a pun referencing computer scientist Scott Aaronson's post advocating "The Kolmogorov Option", serving the cause of Truth by cultivating a bubble that focuses on specific truths that won't get you in trouble with the local political authorities. Named after the Soviet mathematician Andrey Kolmogorov, who knew better than to pick fights he couldn't win. 

  5. In Part One, Chapter VII, "The Exploiters and the Exploited". 

  6. CfAR had been spun off from MIRI in 2012 as a dedicated organization for teaching rationality. 

  7. Yudkowsky's Sequences (except the last) had originally been published on Overcoming Bias before the creation of Less Wrong in early 2009. 

  8. In general, I'm proud of my careful choices of breakup songs. For another example, my breakup song with institutionalized schooling was Taylor Swift's "We Are Never Ever Getting Back Together", a bitter renunciation of an on-again-off-again relationship ("I remember when we broke up / The first time") with an ex who was distant and condescending ("And you, would hide away and find your peace of mind / With some indie record that's much cooler than mine"), thematically reminiscent of my ultimately degree-less string of bad relationships with UC Santa Cruz (2006–2007), Heald College (2008), Diablo Valley College (2010–2012), and San Francisco State University (2012–2013).

    The fact that I've invested so much symbolic significance in carefully-chosen songs by female vocalists to mourn relationships with abstract perceived institutional authorities, and conspicuously not for any relationships with actual women, maybe tells you something about how my life has gone. 

  9. Probably a lot of other people who lived in Berkeley would find it harder to criticize trans people than to criticize some privileged white guy named Yudkowski or whatever. But those weren't the relevant power gradients in my world. 

  10. The Strategy of Conflict, Ch. 2, "An Essay on Bargaining" 


Blanchard's Dangerous Idea and the Plight of the Lucid Crossdreamer

I'm beginning to wonder if he's constructed an entire system of moral philosophy around the effects of the loyalty mod—a prospect that makes me distinctly uneasy. It would hardly be the first time a victim of mental illness has responded to their affliction that way—but it would certainly be the first time I've found myself in the vulnerable position of sharing the brain-damaged prophet's impairment, down to the last neuron.

Quarantine by Greg Egan

In a previous post, "Sexual Dimorphism in Yudkowsky's Sequences, in Relation to My Gender Problems", I told the story about how I've "always" (since puberty) had this obsessive erotic fantasy about being magically transformed into a woman and used to think it was immoral to believe in psychological sex differences, until I read these Sequences of blog posts about how reasoning works by someone named Eliezer Yudkowsky—where one particularly influential-to-me post was the one that explained why fantasies of changing sex are much easier said than done, because the tantalizingly short English phrase doesn't capture the complex implementation details of the real physical universe.

At the time, this was my weird personal thing, which I did not anticipate there being any public interest in blogging about. In particular, I didn't think of myself as being "transgender." The whole time—the dozen years I spent reading everything I could about sex and gender and transgender and feminism and evopsych, and doing various things with my social presentation to try to seem not-masculine—sometimes things I regretted and reverted after a lot of pain, like trying to use my initials as a name—I had been assuming that my gender problems were not the same as those of people who were actually transgender, because the standard narrative said that that was about people whose "internal sense of their own gender does not match their assigned sex at birth", whereas my thing was obviously at least partially an outgrowth of my weird sex fantasy. I had never interpreted the beautiful pure sacred self-identity thing as an "internal sense of my own gender."

Why would I? In the English of my youth, "gender" was understood as a euphemism for sex for people who were squeamish about the potential ambiguity between sex-as-in-biological-sex and sex-as-in-intercourse. (Judging by this blog's domain name, I'm not immune to this, either.) In that language, my "gender"—my sex—is male. Not because I'm necessarily happy about it (and I used to be pointedly insistent that I wasn't), but as an observable biological fact that, whatever my beautiful pure sacred self-identity feelings, I am not delusional about.

Okay, so trans people aren't delusional about their developmental sex. Rather, the claim is that their internal sense of their own gender should take precedence. So where does that leave me? In "Sexual Dimorphism ...", I wrote about my own experiences. I mentioned transgenderedness a number of times, but I tried to cast it as an explanation that one might be tempted to apply to my case, but which I don't think fits. Everything I said is consistent with Ray Blanchard being dumb and wrong when he coined "autogynephilia" (sometimes abbreviated as AGP) as the obvious and perfect word for my thing while studying actual transsexuals—a world where my idiosyncratic weird sex perversion and associated beautiful pure sacred self-identity feelings are taxonomically and etiologically distinct from whatever brain-intersex condition causes actual trans women. That's the world I thought I lived in for ten years after encountering the obvious and perfect word.

My first clue that I wasn't living in that world came from—Eliezer Yudkowsky. (Well, not my first clue. In retrospect, there were lots of clues. My first wake-up call.) In a 26 March 2016 Facebook post, he wrote—

I'm not sure if the following generalization extends to all genetic backgrounds and childhood nutritional backgrounds. There are various ongoing arguments about estrogenlike chemicals in the environment, and those may not be present in every country ...

Still, for people roughly similar to the Bay Area / European mix, I think I'm over 50% probability at this point that at least 20% of the ones with penises are actually women.

(!?!?!?!?)

A lot of them don't know it or wouldn't care, because they're female-minds-in-male-bodies but also cis-by-default (lots of women wouldn't be particularly disturbed if they had a male body; the ones we know as 'trans' are just the ones with unusually strong female gender identities). Or they don't know it because they haven't heard in detail what it feels like to be gender dysphoric, and haven't realized 'oh hey that's me'. See, e.g., https://sinesalvatorem.tumblr.com/post/141690601086/15-regarding-the-4chan-thing-4chans and https://slatestarcodex.com/2013/02/18/typical-mind-and-gender-identity/

Reading that post, I did realize "oh hey that's me"—it's hard to believe that I'm not one of the "20% of the ones with penises"—but I wasn't sure how to reconcile that with the "are actually women" characterization, coming from the guy who taught me how blatantly, ludicrously untrue and impossible that is.

But I'm kinda getting the impression that when you do normalize transgender generally and MtF particularly, like not "I support that in theory!" normalize but "Oh hey a few of my friends are transitioning and nothing bad happened to them", there's a hell of a lot of people who come out as trans.

If that starts to scale up, we might see a really, really interesting moral panic in 5–10 years or so. I mean, if you thought gay marriage was causing a moral panic, you just wait and see what comes next ...

Indeed—here we are over seven years later, and I am panicking.1 As 2007–9 Sequences-era Yudkowsky taught me, and 2016 Facebook-shitposting-era Yudkowsky seemed to ignore, the thing that makes a moral panic really interesting is how hard it is to know you're on the right side of it—and the importance of panicking sideways in cases like this, where the "maximize the number of trans people" and "minimize the number of trans people" coalitions are both wrong.

At the time, this was merely very confusing. I left a careful comment in the Facebook thread, quietly puzzled at what Yudkowsky could be thinking.

A casual friend I'll call "Thomas"2 messaged me, complimenting me on my comment.

"Thomas" was a fellow old-time Less Wrong reader I had met back in 'aught-nine, while I was doing an "internship"3 for what was then still the Singularity Institute for Artificial Intelligence.4

Relevantly, "Thomas" was also autogynephilic (and aware of it, under that name). The first time I had ever gone crossdressing in public was at a drag event with him in 2010.

As it happened, I had messaged him a few days earlier, on 22 March 2016, for the first time in four and a half years. I confided to him that I was seeing an escort on Saturday the twenty-sixth5 because the dating market was looking hopeless, I had more money than I knew what to do with, and three female friends agreed that it was not unethical.

(I didn't have sex with her, obviously. That would be unethical.6)

He had agreed that seeing escorts is ethical—arguably more ethical than casual sex. In the last few years, he had gotten interested in politics and become more socially and sexually conservative. "Free love is a lie," he said, noting that in a more traditional Society, our analogues would probably be married with kids by now.

Also, his gender dysphoria had receded. "At a certain point, I just cut my hair, give away a lot of clothes, and left it behind. I kept waiting to regret it ... but the regret never came," he said. "It's like my brain got pushed off the fence and subtly re-wired."

I had said that I was happy for him and respected him, even while my own life remained pro-dysphoria, pro-ponytails, and anti-politics.

"Thomas" said that he thought Yudkowsky's post was irresponsible because virtually all of the men in Yudkowsky's audience with gender dysphoria were probably autogynephilic. He went on:

To get a little paranoid, I think the power to define other people's identities is extremely useful in politics. If a political coalition can convince you that you have a persecuted identity or sexuality and it will support you, then it owns you for life, and can conscript you for culture wars and elections. Moloch would never pass up this level of power, so that means a constant stream of bad philosophy about identity and sexuality (like trans theory).

So when I see Eliezer trying to convince nerdy men that they are actually women, I see the hand of Moloch.7

We chatted for a few more minutes. I noted Samo Burja's comment on Yudkowsky's post as a "terrible thought" that had also occurred to me: Burja had written that the predicted moral panic may not be along the expected lines, if an explosion of MtFs were to result in trans women dominating previously sex-reserved spheres of social competition. "[F]or signaling reasons, I will not give [the comment] a Like", I added parenthetically.8

A few weeks later, I moved out of my mom's house in Walnut Creek to an apartment on the correct side of the Caldecott tunnel, in Berkeley, closer to other people in the robot-cult scene and with a shorter train ride to my coding dayjob in San Francisco.

(I would later change my mind about which side of the tunnel is the correct one.)

While I was waiting for internet service to be connected in my new apartment, I read a paper copy of Nevada by Imogen Binnie. It's about a trans woman who steals her girlfriend's car to go on a cross-country road trip, and ends up meeting an autogynephilic young man whom she tries to convince that autogynephilia is a bogus concept and that he's actually trans.

In Berkeley, I met interesting people who seemed similar to me along a lot of dimensions, but also very different along other dimensions having to do with how they were currently living their life—much like how the characters in Nevada immediately recognize each other as similar but different. (I saw where Yudkowsky got that 20% figure from.)

This prompted me to do more reading in corners of the literature that I had heard of, but hadn't taken seriously in my twelve years of reading everything I could about sex and gender and transgender and feminism and evopsych. (Kay Brown's blog, On the Science of Changing Sex, was especially helpful.)

Between the reading, and a series of increasingly frustrating private conversations, I gradually became persuaded that Blanchard wasn't dumb and wrong—that his taxonomy of male-to-female transsexuality is basically correct, at least as a first approximation. So far this story has been about my experience, not anyone's theory of transsexuality (which I had assumed for years couldn't possibly apply to me), so let me take a moment to explain the theory now.

(With the caveated understanding that psychology is complicated and there's a lot to be said about what "as a first approximation" is even supposed to mean, but I need a few paragraphs to first talk about the simple version of the theory that makes pretty good predictions on average, as a prerequisite for more complicated theories that might make even better predictions including on cases that diverge from average.)

The theory was put forth by Blanchard in a series of journal articles in the late 'eighties and early 'nineties, and popularized (to some controversy) by J. Michael Bailey in the popular-level book The Man Who Would Be Queen. The idea is that male-to-female transsexuality isn't one phenomenon; it's two completely different phenomena that don't have anything to do with each other, except for the potential treatments of hormone therapy, surgery, and social transition. (Compare to how different medical conditions might happen to respond to the same drug.)

In one taxon, the "early-onset" type, you have same-sex-attracted males who have been extremely feminine (in social behavior, interests, &c.) since early childhood, in a way that causes social problems for them—the far tail of effeminate gay men who end up fitting into Society better as straight women. Blanchard called them "homosexual transsexuals", which is sometimes abbreviated as HSTS. That's where the "woman trapped inside a man's body" trope comes from. This one probably is a brain-intersex condition.

That story is pretty intuitive. Were an alien AI to be informed that, among humans, some fraction of males elect to undergo medical interventions to resemble females and be perceived as females socially, "brain-intersex condition such that they already behave like females" would probably be its top hypothesis, just on priors.

But suppose our alien AI were to be informed that many of the human males seeking to become female do not fit the clinical profile of the early-onset type: it looks like there's a separate "late-onset" type or types, of males who didn't exhibit discordantly sex-atypical behavior in childhood, but later reported a desire to change sex. If you didn't have enough data to prove anything, but you had to guess, what would be your second hypothesis for how this desire might arise?

What's the usual reason for males to be obsessed with female bodies?

Basically, I think a substantial majority of trans women under modern conditions in Western countries are, essentially, guys like me who were less self-aware about what the thing actually is. It's not an innate gender identity; it's a sexual orientation that's surprisingly easy to misinterpret as a gender identity.

I realize this is an inflammatory and (far more importantly) surprising claim. If someone claims to have an internal sense of her gender that doesn't match her assigned sex at birth, on what evidence could I possibly have the arrogance to reply, "No, I think you're really just a perverted male like me"?

Actually, lots. To arbitrarily pick one exhibit, in April 2018, the /r/MtF subreddit, which then had over 28,000 subscribers, posted a link to a poll: "Did you have a gender/body swap/transformation 'fetish' (or similar) before you realized you were trans?". The results: 82% of over 2000 respondents said Yes. Top comment in the thread, with over 230 karma: "I spent a long time in the 'it's probably just a fetish' camp."

Certainly, 82% is not 100%. Certainly, you could argue that Reddit has a sampling bias such that poll results and karma scores from /r/MtF fail to match the distribution of opinion among real-world MtFs. But if you don't take the gender-identity story as an axiom and actually look at what people say and do, these kinds of observations are not hard to find. You could fill an entire subreddit with them (and then move it to independent platforms when the original gets banned for "promoting hate").

Reddit isn't scientific enough for you? Fine. The scientific literature says the same thing. Blanchard 1985: 73% of not exclusively androphilic transsexuals acknowledged some history of erotic cross-dressing. (A lot of the classic studies specifically asked about cross-dressing, but the underlying desire isn't about clothes; Jack Molay coined the term crossdreaming, which seems more apt.) Lawrence 2005: of trans women who had female partners before sex reassignment surgery, 90% reported a history of autogynephilic arousal. Smith et al. 2005: 64% of non-homosexual MtFs (excluding the "missing" and "N/A" responses) reported arousal while cross-dressing during adolescence. (A lot of the classic literature says "non-homosexual", which is with respect to natal sex; the idea is that self-identified bisexuals are still in the late-onset taxon.) Nuttbrock et al. 2011: lifetime prevalence of transvestic fetishism among non-homosexual MtFs was 69%. (For a more detailed literature review, see Kay Brown's blog, Phil Illy's book Autoheterosexual: Attracted to Being the Opposite Sex, or the first two chapters of Anne Lawrence's Men Trapped in Men's Bodies: Narratives of Autogynephilic Transsexualism.)

Peer-reviewed scientific papers aren't enough for you? (They could be cherry-picked; there are lots of scientific journals, and no doubt a lot of bad science slips through the cracks of the review process.) Want something more indicative of a consensus among practitioners? Fine. The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, the definitive taxonomic handbook of the American Psychiatric Association, says the same thing in its section on gender dysphoria:

In both adolescent and adult natal males, there are two broad trajectories for development of gender dysphoria: early onset and late onset. Early-onset gender dysphoria starts in childhood and continues into adolescence and adulthood; or, there is an intermittent period in which the gender dysphoria desists and these individuals self-identify as gay or homosexual, followed by recurrence of gender dysphoria. Late-onset gender dysphoria occurs around puberty or much later in life. Some of these individuals report having had a desire to be of the other gender in childhood that was not expressed verbally to others. Others do not recall any signs of childhood gender dysphoria. For adolescent males with late-onset gender dysphoria, parents often report surprise because they did not see signs of gender dysphoria in childhood. Adolescent and adult natal males with early-onset gender dysphoria are almost always sexually attracted to men (androphilic). Adolescents and adults with late-onset gender dysphoria frequently engage in transvestic behavior with sexual excitement.

(Bolding mine.)

Or consider Anne Vitale's "The Gender Variant Phenomenon—A Developmental Review", which makes the same observations as Blanchard and friends, and arrives at the same two-type taxonomy, but dresses it up in socially-desirable language—

As sexual maturity advances, Group Three, cloistered gender dysphoric boys, often combine excessive masturbation (one individual reported masturbating up to 5 and even 6 times a day) with an increase in secret cross-dressing activity to release anxiety.

Got that? They often combine excessive masturbation with an increase in secret cross-dressing activity to release anxiety—their terrible, terrible gender expression deprivation anxiety!

Don't trust scientists or clinicians? Me neither! (Especially not clinicians.) Want first-person accounts from trans women themselves? Me too! And there's lots!

Consider these excerpts from economist Deirdre McCloskey's memoir Crossing, written in the third person about her decades identifying as a heterosexual crossdresser before transitioning at age 53 (bolding mine):

He had been doing it ten times a month through four decades, whenever possible, though in the closet. The quantifying economist made the calculation: About five thousand episodes. [...] At fifty-two Donald accepted crossdressing as part of who he was. True, if before the realization that he could cross all the way someone had offered a pill to stop the occasional cross-dressing, he would have accepted, since it was mildly distracting—though hardly time consuming. Until the spring of 1995 each of the five thousand episodes was associated with quick, male sex.

Or consider this passage from Julia Serano's Whipping Girl (I know I keep referencing this book, but it's so representative of the dominant strain of trans activism, and I'm never going to get over the Fridge Logic of all the blatant clues that I somehow missed in 2007):

There was also a period of time when I embraced the word "pervert" and viewed my desire to be female as some sort of sexual kink. But after exploring that path, it became obvious that explanation could not account for the vast majority of instances when I thought about being female in a nonsexual context.

"It became obvious that explanation could not account." I don't doubt Serano's reporting of her own phenomenal experiences, but "that explanation could not account" is not an experience; it's a hypothesis about psychology, about the causes of the experience. I don't expect anyone to be able to get that sort of thing right from introspection alone!

Or consider Nevada. This was a popular book, nominated for a 2014 Lambda Literary Award—and described by the author as an attempt to write a story about trans women for an audience of trans women. In Part 2, Chapter 23, our protagonist, Maria, rants about the self-evident falsehood and injustice of autogynephilia theory. And she starts out by ... acknowledging the phenomenon which the theory is meant to explain:

But the only time I couldn't lie to myself about who I wanted to be, and how I wanted to be, and like, the way I needed to exist in the world if I was going to actually exist in the world, is when I was jacking off.

[...]

I was thinking about being a girl while I jacked off, she says, Like, as soon as I started jacking off. For years I thought it was because I was a pervert, that I had this kink I must never, ever tell anyone about, right?

If the idea that most non-androphilic trans women are guys like me is so preposterous, then why do people keep recommending this book?

I could go on ... but do I need to? After having seen enough of these laughable denials of autogynephilia, the main question in my mind has become less, "Is the two-type androphilic/autogynephilic taxonomy of MtF transsexuality approximately true?" (answer: yes, obviously) and more, "How dumb do you (proponents of gender-identity theories) think we (the general public) are?" (answer: very, but correctly).

An important caveat: different causal/etiological stories could be compatible with the same descriptive taxonomy. You shouldn't confuse my mere ridicule with a rigorous critique of the strongest possible case for "gender expression deprivation anxiety" as a theoretical entity, which would be more work. But hopefully I've shown enough work here, that the reader can empathize with the temptation to resort to ridicule?

Everyone's experience is different, but the human mind still has a design. If I hurt my ankle while running and I (knowing nothing of physiology or sports medicine) think it might be a stress fracture, a competent doctor is going to ask followup questions to pin down whether it's a stress fracture or a sprain. I can't be wrong about the fact that my ankle hurts, but I can easily be wrong about why my ankle hurts.

Even if human brains vary more than human ankles, the basic epistemological principle applies to a mysterious desire to be female. The question I need to answer is, Do the trans women whose reports I'm considering have a relevantly different psychological condition than me, or do we have "the same" condition, but (at least) one of us is misdiagnosing it?

The safe answer—the answer that preserves everyone's current stories about themselves—is "different." That's what I thought before 2016. I think a lot of trans activists would say "the same". And on that much, we can agree.

How weaselly am I being with these "approximately true" and "as a first approximation" qualifiers and hedges? I claim: not more weaselly than anyone who tries to reason about psychology given the knowledge our civilization has managed to accumulate.

Psychology is complicated; every human is their own unique snowflake, but it would be impossible to navigate the world using the "every human is their own unique maximum-entropy snowflake; you can't make any probabilistic inferences about someone's mind based on your experiences with other humans" theory. Even if someone were to verbally endorse something like that—and at age sixteen, I might have—their brain is still going to make predictions about people's behavior using some algorithm whose details aren't available to introspection. Much of this predictive machinery is instinct bequeathed by natural selection (because predicting the behavior of conspecifics was useful in the environment of evolutionary adaptedness), but some of it is the cultural accumulation of people's attempts to organize their experience into categories, clusters, diagnoses, taxa.

There could be situations in psychology where a good theory (not perfect, but as good as our theories about how to engineer bridges) would be described by (say) a 70-node causal graph, but in which some of the more important variables anti-correlate with each other. Humans who don't know how to discover the correct 70-node graph, still manage to pattern-match their way to a two-type typology that actually is better, as a first approximation, than pretending not to have a theory. No one matches any particular clinical-profile stereotype exactly, but the world makes more sense when you have language for theoretical abstractions like "comas" or "depression" or "bipolar disorder"—or "autogynephilia".9

I claim that femininity and autogynephilia are two such anti-correlated nodes in the True Causal Graph. They're negatively correlated because they're both children of the sexual orientation node, whose value pushes them in opposite directions: gay men are more feminine than straight men,10 and autogynephiles want to be women because we're straight.
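The structural point (two children of a common parent node end up negatively correlated when the parent pushes them in opposite directions) can be checked with a toy simulation. This is only a sketch of the abstract graph shape, not a model of any real data: the names `parent`, `child_a`, and `child_b`, the unit-variance Gaussian noise, and the coefficients of +1 and -1 are all assumptions I'm making up for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=10_000, seed=0):
    """Sample a three-node graph: parent -> child_a, parent -> child_b.

    The parent raises child_a and lowers child_b (hypothetical
    coefficients +1 and -1). Each child also gets its own independent
    noise, so neither child causes the other.
    """
    rng = random.Random(seed)
    child_a, child_b = [], []
    for _ in range(n):
        parent = rng.gauss(0, 1)
        child_a.append(+1.0 * parent + rng.gauss(0, 1))
        child_b.append(-1.0 * parent + rng.gauss(0, 1))
    return pearson(child_a, child_b)


r = simulate()
print(f"correlation between the two children: {r:.2f}")
```

With these made-up coefficients, the expected correlation works out to -0.5 (covariance of -1 against a variance of 2 for each child), even though there is no arrow between the two children. Conditioning on the parent would make the correlation vanish, which is the sense in which the parent node "explains" the anti-correlation.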

Sex-atypical behavior and the scintillating but ultimately untrue thought are two different reasons why transition might seem like a good idea to someone—different paths through the causal graph leading to the decision to transition. Maybe they're not mutually exclusive, and no doubt there are lots of other contributing factors, such that an overly strict interpretation of the two-type taxonomy is false. If an individual trans woman swears that she doesn't match the feminine/early-onset type, but also doesn't empathize with the experiences I've grouped under "autogynephilia", I don't have any proof with which to accuse her of lying, and the true diversity of human psychology is no doubt richer and stranger than my fuzzy low-resolution model.

But the fuzzy low-resolution model is way too good not to be pointing to some regularity in the real world, and honest people who are exceptions that aren't well-predicted by the model, should notice how well it performs on the non-exceptions. If you're a magical third type of trans woman (where magical is a term of art indicating phenomena not understood) who isn't super-feminine but whose identity definitely isn't ultimately rooted in a fetish, you should be confused by the 230 upvotes on that /r/MtF comment about the "it's probably just a fetish" camp. If the person who wrote that comment has experiences like yours, why did they single out "it's probably just a fetish" as a hypothesis to pay attention to in the first place? And there's a whole "camp" of these people?!

I do have a lot of uncertainty about what the True Causal Graph looks like, even if it seems obvious that the two-type taxonomy coarsely approximates it. Gay femininity and autogynephilia are important nodes in the True Graph, but there's going to be more detail to the whole story: what other factors influence people's decision to transition, including incentives and cultural factors specific to a given place and time?

In our feminist era, cultural attitudes towards men and maleness differ markedly from the overt patriarchy of our ancestors. It feels gauche to say so, but as a result, conscientious boys taught to disdain the crimes of men may pick up an internalized misandry. I remember one night at the University in Santa Cruz back in 'aught-seven, I had the insight that it was possible to make generalizations about groups of people while allowing for exceptions—in contrast to my previous stance that generalizations about people were always morally wrong—and immediately, eagerly proclaimed that men are terrible.

Or consider computer scientist Scott Aaronson's account that his "recurring fantasy, through this period, was to have been born a woman, or a gay man [...] [a]nything, really, other than the curse of having been born a heterosexual male, which [...] meant being consumed by desires that one couldn't act on or even admit without running the risk of becoming an objectifier or a stalker or a harasser or some other creature of the darkness."

Or there's a piece that has made the rounds on social media more than once: "I Am A Transwoman. I Am In The Closet. I Am Not Coming Out", which (in part) discusses the author's frustration at being dismissed on account of being perceived as a cis male. "I hate that the only effective response I can give to 'boys are shit' is 'well I'm not a boy,'" the author laments. And: "Do I even want to convince someone who will only listen to me when they're told by the rules that they have to see me as a girl?"

(The "told by the rules that they have to see me" phrasing in the current revision is telling; the originally published version said "when they find out I'm a girl".)11

If boys are shit, and the rules say that you have to see someone as a girl if they say they're a girl, that provides an incentive on the margin to disidentify with maleness.

This culturally transmitted attitude could intensify the interpretation of autogynephilic attraction as an ego-syntonic beautiful pure sacred self-identity thing, and plausibly be a source of gender dysphoria in males who aren't autogynephilic at all.

In one of my notebooks from 2008, I had written, "It bothers me that Richard Feynman went to strip clubs. I wish Richard Feynman had been trans." I guess the sentiment was that male sexuality is inherently exploitative and Bad, but being trans is morally pure and Good; I wanted Famous Science Raconteur to be Good rather than Bad.

But the reason strip clubs are considered Bad is the same as the reason single-sex locker rooms, hospital wards, &c. were, until recently, considered an obvious necessity: no woman should be forced to undergo the indignity of being exposed in the presence of men. It would have been more scandalous if Feynman had violated the sanctity of women's spaces. Is it supposed to be an improvement if physics-nerd incels who might have otherwise gone to strip clubs, instead declare themselves women? Why? Who is the misandry helping, exactly? Or rather, I could maybe see a case for the misandry serving some useful functions, but not if you're allowed to self-identify out of it.

To the extent it's common for "cognitive" things like internalized misandry to manifest as cross-gender identification, then maybe the two-type taxonomy isn't androphilic/autogynephilic so much as it is androphilic/"not otherwise specified": the early-onset type is behaviorally distinct and has a straightforward motive to transition (in some ways, it would be more weird not to). In contrast, it might not be as easy to distinguish autogynephilia from other sources of gender problems in the grab-bag of all males showing up to the gender clinic for any other reason.

Whatever the True Causal Graph looks like, I think I have more than enough evidence to reject the mainstream "inner sense of gender" story.

The public narrative about transness is obviously, obviously false. That's a problem, because almost no matter what you want, true beliefs are more useful than false beliefs for making decisions that get you there.

Fortunately, Yudkowsky's writing had brought together a whole community of brilliant people dedicated to refining the art of human rationality—the methods of acquiring true beliefs and using them to make decisions that get you what you want. Now I knew the public narrative was obviously false, and I had the outlines of a better theory, though I didn't pretend to know what the social policy implications were. All I should have had to do was carefully explain why the public narrative is delusional, and then because my arguments were so much better, all the intellectually serious people would either agree with me (in public), or be eager to clarify (in public) exactly where they disagreed and what their alternative theory was so that we could move the state of humanity's knowledge forward together, in order to advance the great common task of optimizing the universe in accordance with humane values.

Of course, this is a niche topic—if you're not a male with this psychological condition, or a woman who doesn't want to share female-only spaces with them, you probably have no reason to care—but there are a lot of males with this psychological condition around here! If this whole "rationality" subculture isn't completely fake, then we should be interested in getting the correct answers in public for ourselves.

(It later turned out that this whole "rationality" subculture is completely fake, but I didn't realize this at the time.)

Straight men who fantasize about being women do not particularly resemble actual women! We just—don't? This seems kind of obvious, really? Telling the difference between fantasy and reality is kind of an important life skill?! Notwithstanding that some males might want to use medical interventions like surgery and hormone replacement therapy to become facsimiles of women as far as our existing technology can manage, and that a free and enlightened transhumanist Society should support that as an option—and notwithstanding that she is obviously the correct pronoun for people who look like women—it's going to be harder for people to figure out what the optimal decisions are if no one is ever allowed to use language like "actual women" that clearly distinguishes the original thing from imperfect facsimiles?!

I think most people in roughly my situation (of harboring these gender feelings for many years, thinking that it's obviously not the same thing as being "actually trans", and later discovering that it's not obviously not the same thing) tend to conclude that they were "actually trans" all along, and sometimes express intense bitterness at Ray Blanchard and all the other cultural forces of cisnormativity that let them ever doubt.

I ... went the other direction. In slogan form: "Holy crap, almost no one is actually trans!"

Okay, that slogan isn't right. I'm a transhumanist. I believe in morphological freedom. If someone wants to change sex, that's a valid desire that Society should try to accommodate as much as feasible given currently existing technology. In that sense, anyone can choose to become trans.

The problem is that the public narrative of trans rights doesn't seem to be about making a principled case for morphological freedom, or engaging with the complicated policy question of what accommodations are feasible given the imperfections of currently existing technology. Instead, we're told that everyone has an internal sense of their own gender, which for some people (who "are trans") does not match their assigned sex at birth.

Okay, but what does that mean? Are the things about me that I've been attributing to autogynephilia actually an internal gender identity, or did I get it right the first time? How could I tell? No one seems interested in clarifying!

My shift in belief, from thinking the standard narrative is true about other people but not me, to thinking that the narrative is just a lie, happened gradually over the course of 2016 as the evidence kept piling up—from my reading, from correspondence with the aforementioned Kay Brown—and also as I kept initiating conversations with local trans women to try to figure out what was going on.

Someone I met at the Berkeley Less Wrong meetup who went by Ziz12 denied experiencing autogynephilia at all, and I believe her—but it seems worth noting that Ziz was unusual along a lot of dimensions. Again, I don't think a psychological theory needs to predict every case to be broadly useful for understanding the world.

In contrast, many of the people I talked to seemed to report similar experiences to me (at least, to the low resolution of the conversation; I wasn't going to press people for the specific details of their sexual fantasies) but seemed to me to be either pretty delusional, or privately pretty sane but oddly indifferent to the state of public knowledge.

One trans woman told me that autogynephilia is a typical element of cis woman sexuality. (This, I had learned, was a standard cope, but one I have never found remotely plausible.) She told me that if I don't feel like a boy, I'm probably not one. (Okay, but again, what does that mean? There needs to be some underlying truth condition for that "probably" to point to. If it's not sex and it's not sex-atypical behavior, then what is it?)

Another wrote a comment in one discussion condemning "autogynephilia discourse" and expressing skepticism at the idea that someone would undergo a complete medical and social transition because of a fetish: it might be possible, she admitted, but it must be extremely rare. Elsewhere on the internet, the same person reported being into and aroused by gender-bender manga at the time she was first seriously questioning her gender identity.

Was it rude of me to confront her on the contradiction in her PMs? Yes, it was extremely rude. All else being equal, I would prefer not to probe into other people's private lives and suggest that they're lying to themselves. But when they lie to the public, that affects me, and my attempts to figure out my life. Is it a conscious political ploy, I asked her, or are people really unable to entertain the hypothesis that their beautiful pure self-identity feelings are causally related to the fetish? If it was a conscious political ploy, I wished someone would just say, "Congratulations, you figured out the secret, now keep quiet about it or else," rather than trying to undermine my connection to reality.

She said that she had to deal with enough invalidation already, that she had her own doubts and concerns but would only discuss them with people who shared her views. Fair enough—I'm not entitled to talk to anyone who doesn't want to talk to me.

I gave someone else a copy of Men Trapped in Men's Bodies: Narratives of Autogynephilic Transsexualism. She didn't like it—which I would have respected, if her complaint had just been that Lawrence was overconfident and overgeneralizing, as a factual matter of science and probability. But my acquaintance seemed more preoccupied with how the book was "seemingly deliberately hurtful and disrespectful", using "inherently invalidating language that is very often used in people's dismissal, abuse, and violence towards trans folk", such as calling MtF people "men", referring to straight trans women as "homosexual", and using "transgendered" instead of "transgender". (I would have hoped that the fact that Lawrence is trans and (thinks she) is describing herself would have been enough to make it credible that she didn't mean any harm by saying "men" instead of "a.m.a.b."—and that it should have been obvious that if you reject authors out of hand for not speaking in your own ideology's shibboleths, you lose an important chance to discover if your ideology is getting something wrong.)

The privately sane responses were more interesting. "People are crazy about metaphysics," one trans woman told me. "That's not new. Compare with transubstantiation and how much scholarly work went in to trying to square it with natural materialism. As for causality, I think it's likely that the true explanation will not take the shape of an easily understood narrative."

Later, she told me, "It's kind of funny how the part where you're being annoying isn't where you're being all TERFy and socially unacceptable, but where you make very strong assumptions about truth due to being a total nerd and positivist—mind you, the vast majority of times people deviate from this the consequences are terrible."

Someone else I talked to was less philosophical. "I'm an AGP trans girl who really likes anime, 4chan memes, and the like, and who hangs around a lot with ... AGP trans girls who like anime, 4chan memes, and the like," she said. "It doesn't matter to me all that much if some specific group doesn't take me seriously. As long as trans women are pretty OK at respectability politics and cis people in general don't hate us, then it's probably not something I have to worry about."


I made friends with a trans woman whom I'll call "Helen." My flatmate and I let her crash at our apartment for a few weeks while she was looking for more permanent housing.

There's a certain—dynamic, that can exist between self-aware autogynephilic men, and trans women who are obviously in the same taxon (even if they don't self-identify as such). From the man's end, a mixture of jealousy and brotherly love and a blackmailer's smugness, twisted together in the unspoken assertion, "Everyone else is supposed to politely pretend you're a woman born in the wrong body, but I know the secret."

And from the trans woman's end—I'm not sure. Maybe pity. Maybe the blackmail victim's fear.

One day, "Helen" mentioned having executive-dysfunction troubles about making a necessary telephone call to the doctor's office. The next morning, I messaged her:

I asked my counterfactual friend Zelda how/whether I should remind you to call the doctor in light of our conversation yesterday. "If she was brave enough to self-actualize in the first place rather than cowardly resign herself to a lifetime of dreary servitude to the cistem," she said counterfactually, "—unlike some people I could name—", she added, counterfactually glaring at me, "then she's definitely brave enough to call the doctor at some specific, predetermined time today, perhaps 1:03 p.m."

"The 'vow to call at a specific time' thing never works for me when I'm nervous about making a telephone call," I said. The expression of contempt on her counterfactual face was withering. "Obviously the technique doesn't work for boys!"

I followed up at 1:39 p.m., while I was at my dayjob:

"And then at one-thirty or so, you message her saying, 'There, that wasn't so bad, was it?' And if the call had already been made, it's an affirming comment, but if the call hadn't been made, it functions as a social incentive to actually call in order to be able to truthfully reply 'yeah' rather than admit to still being paralyzed by telephone anxiety."

"You always know what to do," I said. "Nothing like me. It's too bad you're only—" I began to say, just as she counterfactually said, "It's a good thing you're only a figment of my imagination."

"Helen" replied:

i'm in the middle of things. i'll handle it before they close at 5 though, definitely.

I wrote back:

"I don't know," I murmured, "a lot of times in the past when I told myself that I'd make a phone call later, before some place closed, it later turned out that I was lying to myself." "Yeah, but that's because you're a guy. Males are basically composed of lies, as a consequence of https://en.wikipedia.org/wiki/Bateman%27s_principle. Don't worry about ['Helen']."

Or I remember one night we were talking in the living room. I think she was sad about something, and I said—

(I'm not saying I was right to say it; I'm admitting that I did say it)

—I said, "Can I touch your breasts?" and she said, "No," and nothing happened.

I would have never said that to an actual ("cis") woman in a similar context—definitely not one who was staying at my house. This was different, I felt. I had reason to believe that "Helen" was like me, and the reason it felt ethically okay to ask was because I was less afraid of hurting her—that whatever evolutionary-psychological brain adaptation women have to be especially afraid of males probably wasn't there.


I talked about my autogynephilia to a (cis) female friend over Messenger. It took some back-and-forth to explain the concept.

I had mentioned "misdirected heterosexuality"; she said, "Hm, so, like, you could date girls better if you were a girl?"

No, I said, it's weirder than that; the idea of having female anatomy oneself and being able to appreciate it from the first person is intrinsically more exciting than the mere third-person appreciation that you can do in real life as a man.

"[S]o, like, literal autogynephilia is a thing?" she said (as if she had heard the term before, but only as a slur or fringe theory, not as the obvious word for an obviously existing thing).

She mentioned that as a data point, her only effective sex fantasy was her as a hot girl. I said that I expected that to be a qualitatively different phenomenon, based on priors, and—um, details that it would probably be creepy to talk about.

So, she asked, I believed that AGP was a real thing, and in my case, I didn't have lots of desires to be seen as a girl, have a girl name, &c.?

No, I said, I did; it just seemed like it couldn't have been a coincidence that my beautiful pure sacred self-identity thing (the class of things including the hope that my beautiful–beautiful ponytail successfully sets me apart from the guys who are proud of being guys, or feeling happy about getting ma'am'ed over the phone) didn't develop until after puberty.

She said, "hm. so male puberty was a thing you did not like."

No, I said, puberty was fine—it seemed like she was rounding off my self-report to something closer to the standard narrative, but what I was trying to say was that the standard was-always-a-girl-in-some-metaphysical-sense narrative was not true (at least for me, and I suspected for many others).

"The thing is, I don't think it's actually that uncommon!" I said, linking to "Changing Emotions" (the post from Yudkowsky's Sequences explaining why this not-uncommon male fantasy would be technically difficult to fulfill). "It's just that there's no script for it and no one wants to talk about it!"

[redacted] — 09/02/2016 1:23 PM
ok, very weird
yeah, I just don't have a built-in empathic handle for "wants to be a woman."
Zack M. Davis — 09/02/2016 1:24 PM
it even has a TVTrope! http://tvtropes.org/pmwiki/pmwiki.php/Main/ManIFeelLikeAWoman
[redacted] — 09/02/2016 1:27 PM
ok, yeah. wow. it's really just easier for my brain to go "ok, that's a girl" than to understand why anyone would want boobs

I took this as confirmation of my expectation that alleged "autogynephilia" in women is mostly not a thing—that normal women appreciating their own bodies is a qualitatively distinct phenomenon. When she didn't know what I was talking about, my friend mentioned that she also fantasized about being a hot girl. After I went into more detail (and linked the TVTropes page), she said she didn't understand why anyone would want boobs. Well, why would she? But I think a lot of a.m.a.b. people understand.


As the tension continued to mount through mid-2016 between what I was seeing and hearing, and the socially-acceptable public narrative, my frustration started to subtly or not-so-much leak out into my existing blog, but I wanted to write more directly about what I thought was going on.

At first, I was imagining a post on my existing blog, but a couple of my very smart and cowardly friends recommended a pseudonym, which I reluctantly agreed was probably a good idea. I came up with "M. Taylor Saotome-Westlake" as a pen name and started this blog (with loving attention to technology choices, rather than just using WordPress). I'm not entirely without misgivings about the exact naming choices I made, although I don't actively regret it the way I regret my attempted nickname switch in the late 'aughts.13

The pseudonymity quickly became a joke—or rather, a mere differential-visibility market-segmentation pen name and not an Actually Secret pen name, like how everyone knows that Robert Galbraith is J. K. Rowling. It turned out that my need for openness and a unified social identity was far stronger than my grasp of what my very smart and cowardly friends think is prudence, such that I ended up frequently linking to and claiming ownership of the blog from my real name, and otherwise leaking entropy through a sieve on this side.

I kept the Saotome-Westlake byline because, given the world of the current year (such that this blog was even necessary), I figured it was probably a smarter play if the first page of my real-name Google search results wasn't my gender and worse heterodoxy blog. Plus, having made the mistake (?) of listening to my very smart and cowardly friends at the start, I'd face a backwards-compatibility problem if I wanted to unwind the pseudonym: there were already a lot of references to this blog being written by Saotome-Westlake, and I didn't want to throw away or rewrite that history. (The backwards-compatibility problem is also one of several reasons I'm not transitioning.)

It's only now, just before publishing the first parts of this memoir telling my Whole Dumb Story, that I've decided to drop the pseudonym—partially because this Whole Dumb Story is tied up in enough real-world drama that it would be dishonorable and absurd to keep up the charade of hiding my own True Name while speaking so frankly about other people, and partially because my financial situation has improved (and my timelines to transformative AI have deteriorated) to the extent that the risk of missing out on job opportunities due to open heterodoxy seems comparatively unimportant.

(As it happens, Andrea James's Transgender Map website mis-doxxed me as someone else, so I guess the charade worked?)


Besides writing to tell everyone else about it, another consequence of my Blanchardian enlightenment was that I decided to try hormone replacement therapy (HRT). Not to actually socially transition, which seemed as impossible (to actually pull off) and dishonest (to try) as ever, but just to try as a gender-themed drug experiment. Everyone else was doing it—why should I have to miss out just for being more self-aware?

Sarah Constantin, a friend who once worked for our local defunct medical research company, still offered lit reviews as a service, so I paid her $5,000 to do a post about the effects of feminizing hormone replacement therapy on males, in case the depths of the literature had any medical insight to offer that wasn't already on the informed-consent paperwork. Meanwhile, I made the requisite gatekeeping appointments with my healthcare provider to get approved for HRT, first with a psychologist I had seen before, then with a couple of licensed clinical social workers (LCSW).

I was happy to sit through the sessions as standard procedure rather than going DIY, but I was preoccupied with how everyone had been lying to me about the most important thing in my life for fourteen years and the professionals were in on it, and spent a lot of the sessions ranting about that. I gave the psychologist and one of the LCSWs a copy of Men Trapped in Men's Bodies: Narratives of Autogynephilic Transsexualism. (The psychologist said she wasn't allowed to accept gifts with a monetary value of over $25, so I didn't tell her it cost $40.)

I got the sense that the shrinks didn't quite know what to make of me. Years later, I was grateful to discover that the notes from the appointments had been made available to me via the provider's website (despite this practice introducing questionable incentives for the shrinks going forward); it's amusing to read about (for example) one of the LCSWs discussing my case with the department director and "explor[ing] ways in which pt's [patient's] neurodiversity may be impacting his ability to think about desired gender changes and communicate to therapists".

The reality was actually worse than my hostile summary that everyone was lying, and the professionals were in on it. In some ways, it would be better if the professionals secretly agreed with me about the typology and were cynically lying in order to rake in that sweet pharma cash. But they're not—lying. They just have this whole paradigm of providing "equitable" and "compassionate" "gender-affirming care". This is transparently garbage-tier epistemology (for a belief that needs to be affirmed is not a belief at all), but it's so pervasive within its adherents' milieu, that they're incapable of seeing someone not buying it, even when you state your objections very clearly.

Before one of my appointments with the LCSW, I wrote to the psychologist to express frustration about the culture of lying, noting that I needed to chill out and get to a point of emotional stability before starting the HRT experiment. (It's important to have all of one's ducks in a row before doing biochemistry experiments on the ducks.) She wrote back:

I agree with you entirely, both about your frustration with people wanting to dictate to you what you are and how you feel, and with the importance of your being emotionally stable prior to starting hormones. Please explain to those who argue with you that it is only YOUR truth that matter when it comes to you, your body and what makes you feel whole. No one else has the right to dictate this.

I replied:

I'm not sure you do! I know condescending to patients is part of your usual script, but I hope I've shown that I'm smarter than that. This solipsistic culture of "it is only YOUR truth that matters" is exactly what I'm objecting to! People can have false beliefs about themselves! As a psychologist, you shouldn't be encouraging people's delusions; you should be using your decades of study and experience to help people understand the actual psychological facts of the matter so that they can make intelligent choices about their own lives! If you think the Blanchard taxonomy is false, you should tell me that I'm wrong and that it's false and why!

Similarly, the notes from my first call to the gender department claim that I was "exploring gender identity" and that I was "interested in trying [hormones] for a few months to see if they fit with his gender identity". That's not how I remember that conversation! I distinctly remember asking if the department would help me if I wanted to experiment with HRT without socially transitioning: that is, I was asking if they would provide medical services not on the basis of "gender identity". Apparently my existence is so far out-of-distribution that the nurse on the phone wasn't capable of writing down what I actually said.

However weird I must have seemed, I have trouble imagining what anyone else tells the shrinks, given the pile of compelling evidence summarized earlier that most trans women are, in fact, guys like me. If I wanted to, I could cherry-pick from my life to weave a more congruent narrative about always having been a girl on the inside. (Whatever that means! It still seems kind of sexist for that to mean something!) As a small child, I asked for (and received, because I had good '90s liberal parents) Polly Pocket, and a pink and purple girl's scooter with heart decals. I could talk about how sensitive I am. I could go on about my beautiful pure sacred self-identity thing ...

But (as I told the LCSW) I would know that I was cherry-picking. HSTS-taxon boys are identified as effeminate by others. You know it when you see it, even when you're ideologically prohibited from knowing that you know. That's not me. I don't even want that to be me. I definitely have a gender thing, but I have a pretty detailed model of what I think the thing is in the physical universe, and my model doesn't fit in the ever-so-compassionate and -equitable ontology of "gender identity", which presupposes that what's going on when I report wishing I were female is the same thing as what's going on with actual women who (objectively correctly) report being female. I don't think it's the same thing, and I think you'd have to be crazy or a liar to say it is.

I could sympathize with patients in an earlier era of trans healthcare who felt that they had no choice but to lie—to conform to the doctors' conception of a "true transsexual" on pain of being denied treatment.

This was not the situation I saw on the ground in the Bay Area of 2016. If a twentieth-century stalemate of patients lying to skeptical doctors had congealed into a culture of scripted conformity, why had it persisted long after the doctors stopped being skeptical and the lies served no remaining purpose? Why couldn't everyone just snap out of it?


Another consequence of my Blanchardian enlightenment was my break with progressive morality. I had never really been progressive, as such. (I was registered to vote as a Libertarian, the legacy of a teenage dalliance with Ayn Rand and the greater libertarian blogosphere.) But there was still an embedded assumption that, as far as America's culture wars went, I was unambiguously on the right (i.e., left) side of history, the Blue Team and not the Red Team.

Even after years of devouring heresies on the internet—I remember fascinatedly reading everything I could about race and IQ in the wake of the James Watson affair back in 'aught-seven—I had never really questioned my coalitional alignment. With some prompting from "Thomas", I was starting to question it now.

Among many works I had skimmed in the process of skimming lots of things on the internet, was the neoreactionary blog Unqualified Reservations, by Curtis Yarvin, then writing as Mencius Moldbug. The Unqualified Reservations archives caught my renewed interest in light of my recent troubles.

Moldbug paints a picture in which, underneath the fiction of "democracy", the United States is better modeled as an oligarchic theocracy ruled by universities and the press and the civil service. The apparent symmetry between the Democrats and Republicans is fake: the Democrats represent an alliance of the professional–managerial ruling class and their black and Latino underclass clients; the Republicans, representing non-elite whites and the last vestiges of the old ruling elite, can sometimes demagogue their way into high offices, but the left's ownership of the institutions prevents them "conserving" anything for very long.

The reason it ended up this way is because power abhors a vacuum: if you ostensibly put the public mind in charge of the state, that just creates an incentive for power-seeking agents to try to control the public mind. If you have a nominal separation of church and state, but all the incentives that lead to the establishment of a state religion in other Societies are still in play, you've just created selection pressure for a de facto state religion that sheds the ideological trappings of "God" in favor of "progress" and "equality" in order to sidestep the Establishment Clause. People within the system are indoctrinated into a Whig history which holds that people in the past were bad, bad men, but that we're so much more enlightened now in the progress of time. But the progress of time isn't sensitive to what's better; it only tracks what won.

Moldbug contends that the triumph of progressivism is bad insofar as the oligarchic theocracy, for all its lofty rhetoric, is structurally incapable of good governance: it's not a coincidence that all functional non-government organizations are organized as monarchies, with an owner or CEO14 who has the joint authority and responsibility to hand down sane decisions rather than being hamstrung by the insanity of politics (which, as Moldbug frequently notes, is synonymous with democracy).

(Some of Moldbug's claims about the nature of the American order that seemed outlandish or crazy when Unqualified Reservations was being written in the late 'aughts and early 'tens, now seem much more credible after Trump and Brexit and the summer of George Floyd. I remember that in senior year of high school back in 'aught-five, on National Coming Out Day, my physics teacher said that she was coming out as a Republican. Even then, I got the joke, but I didn't realize the implications.)

In one part of his Gentle Introduction to Unqualified Reservations, Moldbug compares the social and legal status of black people in the contemporary United States to hereditary nobility (!!).

Moldbug asks us to imagine a Society with asymmetric legal and social rules for nobles and commoners. It's socially deviant for commoners to be rude to nobles, but permitted for nobles to be rude to commoners. Violence of nobles against commoners is excused on the presumption that the commoners must have done something to provoke it. Nobles are officially preferred in employment and education, and are allowed to organize to advance their collective interests, whereas any organization of commoners qua commoners is outlawed or placed under extreme suspicion.

Moldbug claims that the status of non-Asian minorities in contemporary America is analogous to that of the nobles in his parable. But beyond denouncing the system as unfair, Moldbug furthermore claims that the asymmetric rules have deleterious effects on the beneficiaries themselves:

applied to the cream of America's actual WASP–Ashkenazi aristocracy, genuine genetic elites with average IQs of 120, long histories of civic responsibility and productivity, and strong innate predilections for delayed gratification and hard work, I'm confident that this bizarre version of what we can call ignoble privilege would take no more than two generations to produce a culture of worthless, unredeemable scoundrels. Applied to populations with recent hunter-gatherer ancestry and no great reputation for sturdy moral fiber, noblesse sans oblige is a recipe for the production of absolute human garbage.

This was the sort of right-wing heresy that I could read about on the internet (as I read lots of things on the internet without necessarily agreeing), and see the argument abstractly, without putting any serious weight on it.

It wasn't my place. I'm not a woman or a racial minority; I don't have their lived experience; I don't know what it's like to face the challenges they face. So while I could permissibly read blog posts skeptical of the progressive story about redressing wrongs done to designated sympathetic victim groups, I didn't think of myself as having standing to seriously doubt the story.

Until suddenly, in what was then the current year of 2016, it was now seeming that the designated sympathetic victim group of our age was ... straight boys who wished they were girls. And suddenly, I had standing.

When a political narrative is being pushed for your alleged benefit, it's much easier to make the call that it's obviously full of lies. The claim that political privileges are inculcating "a culture of worthless, unredeemable scoundrels" in some other group is easy to dismiss as bigotry, but it hits differently when you can see it happening to people like you. Notwithstanding whether the progressive story had been right about the travails of Latinos, blacks, and women, I know that straight boys who wish they were girls are not actually as fragile and helpless as we were being portrayed—that we weren't that fragile, if anyone still remembered the world of 'aught-six, when straight boys who wished they were girls knew that the fantasy wasn't real and didn't think the world owed us deference for our perversion. This did raise questions about whether previous iterations of progressive ideology had been entirely honest with me. (If nothing else, I noticed that my update from "Blanchard is probably wrong because trans women's self-reports say it's wrong" to "Self-reports are pretty crazy" probably had implications for "Red Pill is probably wrong because women's self-reports say it's wrong".)


While I was in this flurry of excitement about my recent updates and the insanity around me, I thought back to that Yudkowsky post from back in March that had been my wake-up call to all this. ("I think I'm over 50% probability at this point that at least 20% of the ones with penises are actually women"!)

I wasn't friends with Yudkowsky, of course; I didn't have a natural social affordance to just ask him the way you would ask a dayjob or college acquaintance something. But he had posted about how he was willing to do things he otherwise wouldn't in exchange for enough money to feel happy about the trade—a Happy Price, or Cheerful Price, as the custom was later termed—and his schedule of happy prices listed $1,000 as the price for a 2-hour conversation. I had his email address from previous contract work I had done for MIRI a few years before, so on 29 September 2016, I wrote him offering $1,000 to talk about what kind of massive update he made on the topics of human psychological sex differences and MtF transsexuality sometime between January 2009 and March of the current year, mentioning that I had been "feeling baffled and disappointed (although I shouldn't be) that the rationality community is getting this really easy scientific question wrong" (Subject: "Happy Price offer for a 2 hour conversation").

At this point, any normal people who are (somehow?) reading this might be thinking, isn't that weird and kind of cultish? Some blogger you follow posted something you thought was strange earlier this year, and you want to pay him one grand to talk about it? To the normal person, I would explain thusly—

First, in our subculture, we don't have your weird hangups about money: people's time is valuable, and paying people money to use their time differently than they otherwise would is a perfectly ordinary thing for microeconomic agents to do. Upper-middle-class normal people don't blink at paying a licensed therapist $100 to talk for an hour, because their culture designates that as a special ritualized context in which paying money to talk to someone isn't weird. In my culture, we don't need the special ritualized context; Yudkowsky just had a higher rate than most therapists.

Second, $1,000 isn't actually real money to a San Francisco software engineer.

Third—yes. Yes, it absolutely was kind of cultish. There's a sense in which, sociologically and psychologically speaking, Yudkowsky is a religious leader, and I was—am—a devout adherent of the religion he made up.

By this, I don't mean that the content of Yudkowskian rationalism is comparable to (say) Christianity or Buddhism. But whether or not there is a god or a divine (there is not), the features of human psychology that make Christianity or Buddhism adaptive memeplexes are still going to be active. If the God-shaped hole in my head can't not be filled by something, it's better to fill it with a "religion" about good epistemology, one that can reflect on the fact that beliefs that are adaptive memeplexes are often false. It seems fair to compare my tendency to write in Sequences links to a devout Christian's tendency to quote Scripture by chapter and verse; the underlying mental motion of "appeal to the canonical text" is probably pretty similar. My only defense is that my religion is actually true (and says you should read the texts and think it through for yourself, rather than taking anything on faith).

That's the context in which my happy-price email thread ended up including the sentence, "I feel awful writing Eliezer Yudkowsky about this, because my interactions with you probably have disproportionately more simulation-measure than the rest of my life, and do I really want to spend that on this topic?" (Referring to the idea that, in a sufficiently large universe with many subjectively indistinguishable copies of everyone, including inside of future superintelligences running simulations of the past, there would plausibly be more copies of my interactions with Yudkowsky than of other moments of my life, on account of that information being of greater decision-relevance to those superintelligences.)

I say all this to emphasize just how much Yudkowsky's opinion meant to me. If you were a devout Catholic, and something in the Pope's latest encyclical seemed wrong according to your understanding of Scripture, and you had the opportunity to talk it over with the Pope for a measly $1000, wouldn't you take it?

I don't think I should talk about the results of my cheerful-price inquiry (whether a conversation occurred, or what was said if it did), because any conversation would be protected by the privacy rules that I'm holding myself to in telling this Whole Dumb Story.

(Incidentally, it was also around this time that I snuck a copy of Men Trapped in Men's Bodies into the MIRI office library, which was sometimes possible for community members to visit. It seemed like something Harry Potter-Evans-Verres would do—and ominously, I noticed, not like something Hermione Granger would do.)


If I had to pick a date for my break with progressive morality, it would be 7 October 2016. Over the past few days, I had been having a frustrating Messenger conversation with some guy, which I would later describe as feeling like I was talking to an AI designed to maximize the number of trans people. He didn't even bother making his denials cohere with each other, insisting with minimal argument that my ideas were wrong and overconfident and irrelevant and harmful to talk about. When I exasperatedly pointed out that fantasizing about being a woman is not the same thing as literally already being a woman, he replied, "Categories were made for man, not man for the categories", referring to a 2014 Slate Star Codex post which argued that the inherent subjectivity of drawing category boundaries justified acceptance of trans people's identities.

Over the previous weeks and months, I had been frustrated with the zeitgeist, but I was trying not to be loud or obnoxious about it, because I wanted to be a good person and not hurt anyone's feelings and not lose any more friends. ("Helen" had rebuffed my last few requests to chat or hang out. "I don't fully endorse the silence," she had said, "just find talking vaguely aversive.")

This conversation made it very clear to me that I could have no peace with the zeitgeist. It wasn't the mere fact that some guy in my social circle was being dumb and gaslighty about it. It was the fact that his performance was an unusually pure distillation of socially normative behavior in Berkeley 2016: there were more copies of him than there were of me.

Opposing this was worth losing friends, worth hurting feelings—and, actually, worth the other thing. I posted on Facebook in the morning and on my real-name blog in the evening:

the moment of liberating clarity when you resolve the tension between being a good person and the requirement to pretend to be stupid by deciding not to be a good person anymore 💖

Former MIRI president Michael Vassar emailed me about the Facebook post, and we ended up meeting once. (I had also emailed him back in August, when I had heard from my friend Anna Salamon that he was also skeptical of the transgender movement (Subject: "I've heard of fake geek girls, but this is ridiculous").)


I wrote about my frustrations to Scott Alexander of Slate Star Codex fame (Subject: "J. Michael Bailey did nothing wrong"). The immediate result was that he ended up including a link to one of Kay Brown's study summaries (and expressing surprise at the claim that non-androphilic trans women have very high IQs) in his November 2016 links post. He got some pushback even for that.


A trans woman named Sophia commented on one of my real-name blog posts, thanking me for the recommendation of Men Trapped in Men's Bodies. "It strongly spoke to many of my experiences as a trans woman that I've been treating as unmentionable. (Especially among my many trans friends!)" she wrote. "I think I'm going to start treating them as mentionable."

We struck up an email correspondence. She had found my blog from the Slate Star Codex blogroll. She had transitioned in July of the previous year at age 35, to universal support. (In Portland, which was perhaps uniquely good in this way.)

I said I was happy for her—probably more so than the average person who says that—but that (despite living in Berkeley, which was perhaps uniquely in contention with Portland for being perhaps uniquely good in this way) there were showstopping contraindications to social transition in my case. It really mattered what order you learn things in. The 2016 zeitgeist had the back of people who model themselves as women who were assigned male at birth, but not people who model themselves as men who love women and want to become what they love. If you first realize, "Oh, I'm trans," and then successfully transition, and then read Anne Lawrence, you can say, "Huh, seems plausible that my gender identity was caused by my autogynephilic sexuality rather than the other way around," shrug, and continue living happily ever after. In contrast, I had already been thinking of myself as autogynephilic (but not trans) for ten years. Even in Portland or Berkeley, you still have to send that coming-out email, and I couldn't claim to have a "gender identity" with a straight face.

Sophia said she would recommend Men Trapped in Men's Bodies on her Facebook wall. I said she was brave—well, we already knew she was brave because she actually transitioned—but, I suggested, maybe it would be better to wait until October 11th?

To help explain why she thought transitioning was more feasible than I did, she suggested a folkloric anti-dysphoria exercise: look at women you see in public, and try to pick out which features /r/gendercritical would call out in order to confirm that she's obviously a man.

I replied that "obviously a man" was an unsophisticated form of trans-skepticism. I had been thinking of gendering in terms of naïve Bayes models: you observe some features, use those to assign (probabilities of) category membership, and then use category membership to make predictions about whatever other features you might care about but can't immediately observe. Sure, it's possible for an attempted clocking to be mistaken, and you can have third-gender categories such that AGP trans women aren't "men"—but they're still not drawn from anything close to the same distribution as cis women.

Sophia replied with an information-theoretic analysis of passing, which I would later adapt into a guest post for this blog. If the base rate of AGP transsexuality in Portland was 0.1%, someone would need log2(99.9%/0.1%) ≈ 9.96 ≈ 10 bits of evidence to clock her as trans. If one's facial structure was of a kind four times more likely to be from a male than a female, that would only contribute 2 bits. Sophia was 5′7″, which is about where the female and male height distributions cross over, so she wasn't leaking any bits there. And so on—the prospect of passing in naturalistic settings is a different question from whether there exists evidence that a trans person is trans. There is evidence—but as long as it's comfortably under 10 bits, it won't be a problem.
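The arithmetic in that analysis can be sketched in a few lines of Python. This is only an illustration of the calculation as described above, not Sophia's actual analysis: the function names are made up for the example, and adding bits across features assumes the features are conditionally independent given the category (the same simplification a naïve Bayes model makes).

```python
import math

def bits_needed(base_rate: float) -> float:
    """Bits of evidence required to bring the prior odds up to even odds."""
    return math.log2((1 - base_rate) / base_rate)

def bits_from_likelihood_ratio(ratio: float) -> float:
    """Bits contributed by one observed feature with the given likelihood ratio."""
    return math.log2(ratio)

# Figures taken from the paragraph above.
threshold = bits_needed(0.001)                 # 0.1% base rate, about 9.96 bits
face_bits = bits_from_likelihood_ratio(4.0)    # feature 4x likelier from a male
height_bits = bits_from_likelihood_ratio(1.0)  # height at the crossover point

# Summing bits is only valid if the features are independent given the category.
total_bits = face_bits + height_bits
print(f"needed: {threshold:.2f} bits, observed: {total_bits:.2f} bits")
```

With only these two features, the observed evidence stays far below the threshold, which is the point of the argument.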

I agreed that for most people in most everyday situations it probably didn't matter. I cared because I was a computational philosophy of gender nerd, I said, linking to a program I had written to simulate sex classification based on personality, using data from a paper by Weisberg et al. about sex differences in correlated "facets" underlying the Big Five personality traits. (For example, studies had shown that women and men didn't differ in Big Five Extraversion, but if you split "Extraversion" into "Enthusiasm" and "Assertiveness", there were small sex differences pointing in opposite directions, with men being more assertive.) My program generated random examples of women's and men's personality stats according to the Weisberg et al. data, then tried to classify the "actual" sex of each example given only the personality stats—only reaching 63% accuracy, which was good news for androgyny fans like me.

Sophia had some cutting methodological critiques. The paper had given residual statistics of each facet against the other—like the mean and standard deviation of Enthusiasm minus Assertiveness—so I assumed you could randomly generate one facet and then use the residual stats to get a "diff" from one to the other. Sophia pointed out that you can't use residuals for sampling like that, because the actual distribution of the residual was highly dependent on the first facet. Given an unusually high value for one facet, taking the overall residual stats as independent would imply that the other facet was equally likely to be higher or lower, which was absurd.

(For example, suppose that "height" and "weight" are correlated aspects of a Bigness factor. Given that someone's weight is +2σ—two standard deviations heavier than the mean—it's not plausible that their height is equally likely to be +1.5σ and +2.5σ, because the former height is more than seven times more common than the latter; the second facet should regress towards the mean.)
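A small Python sketch makes both halves of that example concrete, under the assumption that the two facets are jointly normal and standardized. The correlation value used here (0.5) is purely hypothetical and is not taken from the Weisberg et al. paper; the helper names are likewise invented for illustration.

```python
import math
import random

def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# How much more common +1.5 sigma is than +2.5 sigma: exp(2), a bit over seven.
ratio = normal_pdf(1.5) / normal_pdf(2.5)

def sample_second_given_first(z1: float, rho: float, rng: random.Random) -> float:
    """Draw the second facet conditional on the first, for a standard
    bivariate normal with correlation rho (mean shrinks to rho * z1)."""
    conditional_mean = rho * z1
    conditional_sd = math.sqrt(1 - rho ** 2)
    return rng.gauss(conditional_mean, conditional_sd)

RHO = 0.5  # hypothetical correlation, for illustration only
rng = random.Random(0)
draws = [sample_second_given_first(2.0, RHO, rng) for _ in range(10_000)]
average = sum(draws) / len(draws)
print(f"density ratio: {ratio:.2f}, mean of second facet given +2 sigma: {average:.2f}")
```

Sampling the second facet from its conditional distribution, rather than adding an independently drawn residual, is what produces the regression towards the mean: the average lands near ρ × 2σ, not near 2σ.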

Sophia built her own model in Excel using the correlation matrix from the paper, and found a classifier with 68% accuracy.


On the evening of 10 October 2016, I put up my Facebook post for Coming Out Day:

Happy Coming Out Day! I'm a male with mild gender dysphoria which is almost certainly causally related to my autogynephilic sexual/romantic orientation, which I am genuinely proud of! This has no particular implications for how other people should interact with me!

I believe that late-onset gender dysphoria in males is almost certainly not an intersex condition. (Here "late-onset" is a term of art meant to distinguish people like me from those with early-onset gender dysphoria, which is characterized by lifelong feminine behavior and a predominantly androphilic sexual orientation. Anne Vitale writes about these as "Group Three" and "Group One" in "The Gender Variant Phenomenon": http://www.avitale.com/developmentalreview.htm ) I think it's important to not let the political struggle to secure people's rights to self-modification interfere with the pursuit of scientific knowledge, because having a realistic understanding of the psychological mechanisms underlying one's feelings is often useful in helping individuals make better decisions about their own lives in accordance with the actual costs and benefits of available interventions (rather than on the basis of some hypothesized innate identity). Even if the mechanisms turn out to not be what one thought they were—ultimately, people can stand what is true.

Because we are already enduring it.

It got 40 Likes—and one comment (from my half-brother, who was supportive but didn't seem to understand what I was trying to do). Afterward, I wondered if I had been too subtle—or whether no one wanted to look like a jerk by taking the bait and starting a political fight on my brave personal self-disclosure post.

But Coming Out Day isn't, strictly, personal. I had self-identified as autogynephilic for ten years without being particularly "out" about it (except during the very unusual occasions when it was genuinely on-topic); the only reason I was making a Coming Out Day post in 2016 and not any of the previous ten years was because the political environment had made it an issue.

In some ways, it was nice to talk about an important part of my life that I otherwise mostly didn't get the opportunity to talk about. But if that had to come in the form of a deluge of lies for me to combat, on net, I preferred the closet.


I messaged an alumna of my App Academy class of November 2013. I remembered that on the first day of App Academy, she had asked about the sexual harassment policy, to which the founder/instructor hesitated and promised to get back to her; apparently, it had never come up before. (This was back when App Academy was still cool and let you sleep on the floor if you wanted.) Later, she started a quarrel with another student (a boy just out of high school, in contrast to most attendees already having a college degree) over something offensive he had said; someone else had pointed out in his defense that he was young. (Young enough not to have been trained not to say anything that could be construed as anti-feminist in a professional setting?)

In short, I wanted to consult her feminism expertise; she seemed like the kind of person who might have valuable opinions on whether men could become women by means of saying so. "[O]n the one hand, I'm glad that other people get to live my wildest fantasy", I said, after explaining my problem, "but on the other hand, maaaaaybe we shouldn't actively encourage people to take their fantasies quite this literally? Maybe you don't want people like me in your bathroom for the same reason you're annoyed by men's behavior on trains?"

She asked if I had read The Man Who Would Be Queen. (I had.) She said she personally didn't care about bathrooms.

She had also read a lot about related topics (in part because of her own history as a gender-nonconforming child), but found this area of it (autogynephilia, &c.) difficult to talk about except from one's lived experience because "the public narrative is very ... singular". She thought that whether and how dysphoria was related to eroticism could be different for different people. She also thought the singular narrative had been culturally important in the same way as the "gay is not a choice" narrative, letting people with less privilege live in a way that makes them happy with less of a penalty. (She did empathize with concern about children being encouraged to transition early; given the opportunity to go to school as a boy at age 7, she would have taken it, and it would have been the wrong path.)

She asked if I was at all suicidal. (I wasn't.)

These are all very reasonable opinions. If I were her (if only!), I'm sure I would believe all the same things. But if so many nice, smart, reasonable liberals privately notice that the public narrative is very singular, and none of them point out that the singular narrative is not true, because they appreciate its cultural importance—doesn't that—shouldn't that—call into question the trustworthiness of the consensus of the nice, smart, reasonable liberals? How do you know what's good in the real world if you mostly live in the world of the narrative?


Of course, not all feminists were of the same mind on this issue. In late December 2016, I posted an introductory message to the "Peak Trans" thread on /r/gendercritical, explaining my problem.

The first comment was "You are a predator."

I'm not sure what I was expecting. I spent part of Christmas Day crying.


At the end of December 2016, my gatekeeping sessions were finished, and I finally started HRT: Climara® 0.05 mg/day estrogen patches, to be applied to the abdomen once a week. The patch was supposed to stay on the entire week despite showering, &c.

Interestingly, the indications listed in the package insert were all for symptoms due to menopause, post-menopause, or "hypogonadism, castration, or primary ovarian failure." If it was commonly prescribed to intact males with an "internal sense of their own gender", neither the drug company nor the FDA seemed to know about it.

In an effort to not let my anti–autogynephilia-denialism crusade take over my life, earlier that month, I promised myself (and published the SHA256 hash of the promise to signal that I was Serious) not to comment on gender issues under my real name through June 2017. That was what my new secret blog was for.
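Publishing a hash like this is a simple commitment scheme: the digest can be posted now, and the text revealed later for anyone to check. A minimal sketch in Python, assuming the promise is a UTF-8 string; the promise text below is a placeholder, not the actual wording, and the function names are invented for the example.

```python
import hashlib

def commit(text: str) -> str:
    """Return the SHA-256 hex digest to publish as a commitment."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def verify(text: str, published_digest: str) -> bool:
    """Check a later-revealed text against the published digest."""
    return commit(text) == published_digest

# Placeholder text; the real promise is not reproduced here.
promise = "I will not comment on topic X under my real name until date Y."
digest = commit(promise)
print(digest)
```

Any change to the text, even a single trailing space, yields a different digest. A short or guessable promise could be brute-forced from its hash, so a careful version would append a random secret string to the text before hashing.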


The promise didn't take. There was just too much gender-identity nonsense on my Facebook feed.

"Folks, I'm not sure it's feasible to have an intellectually-honest real-name public conversation about the etiology of MtF," I wrote in one thread in mid-January 2017. "If no one is willing to mention some of the key relevant facts, maybe it's less misleading to just say nothing."

As a result of that, I got a PM from a woman I'll call "Rebecca" whose relationship had fallen apart after (among other things) her partner transitioned. She told me about the parts of her partner's story that had never quite made sense to her (but sounded like a textbook case from my reading). In her telling, he was always more emotionally tentative and less comfortable with the standard gender role and status stuff, but in the way of like, a geeky nerd guy, not in the way of someone feminine. He was into crossdressing sometimes, but she had thought that was just an insignificant kink, not that he didn't like being a man—until they moved to the Bay Area and he fell in with a social-justicey crowd. When I linked her to Kay Brown's article on "Advice for Wives and Girlfriends of Autogynephiles", her response was, "Holy shit, this is exactly what happened with me." It was nice to make a friend over shared heresy.


As a mere heretic, it was also nice to have an outright apostate as a friend. I had kept in touch with "Thomas", who provided a refreshing contrary perspective to the things I was hearing from everyone else. For example, when the rationalists were anxious that the election of Donald Trump in 2016 portended an increased risk of nuclear war, "Thomas" pointed out that Clinton was actually much more hawkish towards Russia, the U.S.'s most likely nuclear adversary.

I shared an early draft of "Don't Negotiate With Terrorist Memeplexes" with him, which fleshed out his idea from back in March 2016 about political forces incentivizing people to adopt an identity as a persecuted trans person.

He identified the "talking like an AI" phenomenon that I mentioned in the post as possession by an egregore, a group-mind holding sway over the beliefs of the humans comprising it. The function of traditional power arrangements with kings and priests was to put an individual human with judgement in the position of being able to tame, control, or at least negotiate with egregores. Individualism was flawed because individual humans couldn't be rational on their own. Being an individualist in an environment full of egregores was like being an attractive woman alone at a bar yelling, "I'm single!"—practically calling out for unaligned entities to wear down your psychological defenses and subvert your will.

Rationalists implicitly seek Aumann-like agreement with perceived peers, he explained: when the other person is visibly unmoved by one's argument, there's a tendency to think, "Huh, they must know something I don't" and update towards their position. Without an understanding of egregoric possession, this is disastrous: the possessed person never budges on anything significant, and the rationalist slowly gets eaten by their egregore.

I was nonplussed: I had heard of patterns of refactored agency, but this was ridiculous. This "egregore" framing was an interesting alternative way of looking at things, but it seemed—nonlocal. There were inhuman patterns in human agency that we wanted to build models of, but it seemed like he was attributing too much agency to the patterns. In contrast, "This idea creates incentives to propagate itself" was a mechanism I understood. (Or was I being like one of those dumb critics of Richard Dawkins who protest that genes aren't actually selfish? We know that; the anthropomorphic language is just convenient.)

I supposed I was modeling "Thomas" as being possessed by the neoreaction egregore, and myself as experiencing a lower (but still far from zero) net egregoric force by listening to both him and the mainstream rationalist egregore.

He was a useful sounding board when I was frustrated with my so-far-mostly-private trans discussions.

"If people with fragile identities weren't useful as a proxy weapon for certain political coalitions, then they would have no incentive to try to play language police and twist people's arms into accepting their identities," he said once.

"OK, but I still want my own breasts," I said.

"[A]s long as you are resisting the dark linguistic power that the left is offering you," he said, with a smiley emoticon.

In some of my private discussions with others, Ozy Brennan (a.f.a.b. nonbinary author of Thing of Things) had been cited as a local authority figure on gender issues: someone asked what Ozy thought about the two-type taxonomy, or wasn't persuaded because they were partially deferring to Ozy, who had been broadly critical of the theory.15 I remarked to "Thomas" that this implied that my goal should be to overthrow Ozy (whom I otherwise liked) as de facto rationalist gender czar.

"Thomas" didn't think this was feasible. The problem, he explained, was that "hypomasculine men are often broken people who idolize feminists, and worship the first one who throws a few bones of sympathy towards men." (He had been in this category, so he could make fun of them.) Thus, the female person would win priestly battles in nerdy communities, regardless of quality of arguments. It wasn't Ozy's fault, really. They weren't power-seeking; they just happened to fulfill a preexisting demand for feminist validation.


In a January 2017 Facebook thread about the mystery of why so many rationalists were trans, "Helen" posited the metacognition needed to identify the strange, subtle unpleasantness of gender dysphoria as a potential explanatory factor.

I messaged her, ostensibly to ask for my spare key back, but really (I quickly let slip) because I was angry about the pompous and deceptive Facebook comment: maybe it wouldn't take so much metacognition if someone would just mention the other diagnostic criterion!

She sent me a photo of the key with half of the blade snapped off next to a set of pliers, sent me $8 (presumably to pay for the key), and told me to go away.

On my next bank statement, her deadname appeared in the memo line for the $8 transaction.


I made plans to visit Portland, for the purpose of meeting Sophia, and two other excuses. There was a fandom convention in town, and I wanted to try playing Pearl from Steven Universe again—but this time with makeup and breastforms and a realistic gem. Also, I had been thinking of obfuscating my location as being part of the thing to do for keeping my secret blog secret, and had correspondingly adopted the conceit of setting my little fictional vignettes in the Portland metropolitan area, as if I lived there.16 I thought it would be cute to get some original photographs of local landmarks (like TriMet trains, or one of the bridges over the Willamette River) to lend verisimilitude to the charade.

In a 4 February 2017 email confirming the plans with Sophia, I thanked her for her earlier promise not to be offended by things I might say, which I was interpreting literally, and without which I wouldn't dare meet her. Unfortunately, I was feeling somewhat motivated to generally avoid trans women now. Better to quietly (except for pseudonymous internet yelling) stay out of everyone's way rather than risk the temptation to say the wrong thing and cause a drama explosion.


The pretense of quietly staying out of everyone's way lasted about three days.

In a 7 February 2017 comment thread on the Facebook wall of MIRI Communications Director Rob Bensinger, someone said something about closeted trans women, linking to the "I Am In The Closet. I Am Not Coming Out" piece.

I objected that surely closeted trans women are cis: "To say that someone already is a woman simply by virtue of having the same underlying psychological condition that motivates people to actually take the steps of transitioning (and thereby become a trans woman) kind of makes it hard to have a balanced discussion of the costs and benefits of transitioning."

(That is, I was assuming "cis" meant "not transitioned", whereas the other commenter seemed to be assuming a gender-identity model, such that guys like me aren't cis.)

Bensinger replied:

Zack, "woman" doesn't unambiguously refer to the thing you're trying to point at, even if no one were socially punishing you for using the term that way, and even if we were ignoring any psychological harm to people whose dysphoria is triggered by that word usage, there'd be the problem regardless that these terms are already used in lots of different ways by different groups. The most common existing gender terms are a semantic minefield at the same time they're a dysphoric and political minefield, and everyone adopting the policy of objecting when anyone uses man/woman/male/female/etc. in any way other than the way they prefer is not going to solve the problem at all.

Bensinger followed up with another comment offering constructive suggestions: say "XX-cluster" when you want to talk about things that correlate with XX chromosomes.

So, this definitely wasn't the worst obfuscation attempt I'd face during this Whole Dumb Story; I of course agree that words are used in different ways by different groups. It's just—I think it should have already been clear from my comments that I understood that words can be used in many ways; my objection to the other commenter's usage was backed by a specific argument about the expressive power of language; Bensinger didn't acknowledge my argument. (The other commenter, to her credit, did.)

To be fair to Bensinger, it's entirely possible that he was criticizing me specifically because I was the "aggressor" objecting to someone else's word usage, and that he would have stuck up for me just the same if someone had "aggressed" against me using the word woman in a sense that excluded non-socially-transitioned gender-dysphoric males, for the same reason ("adopting the policy of objecting when anyone uses man/woman/male/female/etc. in any way other than the way they prefer is not going to solve the problem at all").

But in the social context of Berkeley 2016, I was suspicious that that wasn't actually his algorithm. It is a distortion if socially-liberal people in the current year selectively drag out the "It's pointless to object to someone else's terminology" argument specifically when someone wants to talk about biological sex (or even socially perceived sex!) rather than self-identified gender identity—but objecting on the grounds of "psychological harm to people whose dysphoria is triggered by that word usage" is potentially kosher.

Someone named Ben Hoffman, whom I hadn't previously known or thought much about, put a Like on one of my comments. I messaged him to say hi, and to thank him for the Like, "but maybe it's petty and tribalist to be counting Likes".


Having already started to argue with people under my real name (in violation of my previous intent to save it for the secret blog), the logic of "in for a lamb, in for a sheep" (or "may as well be hung for a pound as a penny") started to kick in. On the evening of Saturday 11 February 2017, I posted to my own wall:

Some of you may have noticed that I've recently decided to wage a suicidally aggressive one-person culture war campaign with the aim of liberating mindshare from the delusional victimhood identity politics mind-virus and bringing it under the control of our familiar "compete for status by signaling cynical self-awareness" egregore! The latter is actually probably not as Friendly as we like to think, as some unknown fraction of its output is counterfeit utility in the form of seemingly cynically self-aware insights that are, in fact, not true. Even if the fraction of counterfeit insights is near unity, the competition to generate seemingly cynically self-aware insights is so obviously much healthier than the competition for designated victimhood status, that I feel good about this campaign being morally correct, even [if] the amount of mindshare liberated is small and I personally don't survive.

I followed it up the next morning with a hastily-written post addressed, "Dear Totally Excellent Rationalist Friends".17 As a transhumanist, I believe that people should get what they want, and that we should have social norms designed to help people get what they want. But fantasizing about having a property (in context, being a woman, but I felt motivated to be vague for some reason) without yet having sought out interventions to acquire the property, is not the same thing as somehow already literally having the property in some unspecified metaphysical sense. The process of attempting to acquire the property does not propagate backwards in time. I realized that explaining this in clear language had the potential to hurt people's feelings, but as an aspiring epistemic rationalist, I had a goddamned moral responsibility to hurt those people's feelings. I was proud of my autogynephilic fantasy life, and proud of my rationalist community, and I didn't want either of them being taken over by crazy people who think they can edit the past.

It got 170 comments, a large fraction of which were me arguing with a woman whom I'll call "Margaret" (with whom I had also had an exchange in the thread on Bensinger's wall on 7 February).

"[O]ne of the things trans women want is to be referred to as women," she said. "This is not actually difficult, we can just do it." She was pretty sure I must have read the relevant Slate Star Codex post, "The Categories Were Made for Man, Not Man for the Categories".

I replied that I had an unfinished draft post about this, but briefly, faced with a demand to alter one's language to spare someone's feelings, one possible response might be to submit to the demand. But another possible response might be: "I don't negotiate with terrorists. People have been using this word to refer to a particular thing for the last 200,000 years since the invention of language, and if that hurts your feelings, that's not my problem." The second response was certainly not very nice. But maybe there were other values than being nice?—sometimes?

In this case, the value being served had to do with there being an empirical statistical structure of bodies and minds in the world that becomes a lot harder to talk about if you insist that everyone gets to define how others perceive them. I didn't like the structure that I was seeing; like many people in my age cohort, and many people who shared my paraphilic sexual orientation, I had an ideological obsession with androgyny as a moral ideal. But the cost of making it harder to talk about the structure might outweigh the benefit of letting everyone dictate how other people should perceive them!

Nick Tarleton asked me to clarify: was I saying that people who assert that "trans women are women" were sneaking in connotations or denotations that were false in light of so many trans women being (I claimed) autogynephilic, even when those people also claimed that they didn't mean anything predictive by "women"?

Yes! I replied. People seemed to be talking as if there were some intrinsic gender-identity switch in the brain, and if a physiological male had the switch in the female position, that meant they Were Trans and needed to transition. I thought that was a terrible model of the underlying psychological condition. I thought we should be talking about clever strategies to maximize the quantity "gender euphoria minus gender dysphoria", and it wasn't at all obvious that full-time transition was the uniquely best solution.

"Margaret" said that what she thought was going on was that I was defining woman as someone who has a female-typical brain or body, but she was defining woman as someone who thinks of themself as a woman or is happier being categorized that way. With the latter definition, the only way someone could be wrong about whether they were a woman would be to try it and find out that they were less happy that way.

I replied: but that was circular, right?—that women are people who are happier being categorized as women. However you chose to define it, your mental associations with the word woman were going to be anchored on your experiences with adult human females. I wasn't saying people couldn't transition! You can transition if you want! I just thought the details were really important!


In another post that afternoon, I acknowledged my right-wing influences. You know, you spend nine years reading a lot of ideologically-inconvenient science, all the while thinking, "Oh, this is just interesting science, you know, I'm not going to let myself get morally corrupted by it or anything." And for the last couple years, you add in some ideologically-inconvenient political thinkers, too.

But I was still a nice good socially-liberal "Free to Be You and Me" gender-egalitarian individualist person. Because I understood the is–ought distinction—unlike some people—I knew that I could learn from people's models of the world without necessarily agreeing with their goals. So I had been trying to learn from the models of these bad people saying the bad things, until one day, the model clicked. And the model was terrifying. And the model had decision-relevant implications for the people who valued the things that I valued—

The thing was, I actually didn't think I had been morally corrupted! I thought I was actually really good at maintaining the is–ought distinction in my mind. But for people who hadn't followed my exact intellectual trajectory, the mere fact that I was saying, "Wait! Stop! The things that you're doing may not in fact be the optimal things!" made it look like I'd been morally corrupted, and there was no easy way for me to prove otherwise.

So, people probably shouldn't believe me. This was just a little manic episode with no serious implications. Right?


Somewhat awkwardly, I had a date scheduled with "Margaret" that evening. The way that happened was that, elsewhere on Facebook, on 7 February, Brent Dill had said that he didn't see the value in the community matchmaking site reciprocity.io, and I disagreed, saying that the hang-out matching had been valuable to me, even if the romantic matching was useless for insufficiently high-status males.

"Margaret" had complained: "again with pretending only guys can ever have difficulties getting dates (sorry for this reaction, I just find this incredibly annoying)". I had said that she shouldn't apologize; I usually didn't make that genre of comment, but it seemed thematically appropriate while replying to Brent (who was locally infamous for espousing cynical views about status and social reality, and not yet locally infamous for anything worse than that).

(And privately, the audacity of trying to spin a complaint into a date seemed like the kind of male-typical stunt that I was starting to consider potentially morally acceptable after all.)

Incidentally, I added, I was thinking of seeing that new Hidden Figures movie if I could find someone to go with? It turned out that she had already seen it, but we made plans to see West Side Story at the Castro Theatre instead.

The date was pretty terrible. We walked around the Castro for a bit continuing to debate the gender thing, then saw the movie. I was very distracted and couldn't pay attention to the movie at all.


I continued to be very distracted the next day, Monday 13 February 2017. I went to my office, but definitely didn't get any dayjob work done.

I made another seven Facebook posts. I'm proud of this one:

So, unfortunately, I never got very far in the Daphne Koller and the Methods of Rationality book (yet! growth m—splat, AUGH), but one thing I do remember is that many different Bayesian networks can represent the same probability distribution. And the reason I've been running around yelling at everyone for nine months is that I've been talking to people, and we agree on the observations that need to be explained, and yet we explain them in completely different ways. And I'm like, "My network has SO MANY FEWER ARROWS than your network!" And they're like, "Huh? What's wrong with you? Your network isn't any better than the standard-issue network. Why do you care so much about this completely arbitrary property 'number of arrows'? Categories were made for the man, not man for the categories!" And I'm like, "Look, I didn't get far enough in the Daphne Koller and the Methods of Rationality book to understand why, but I'm PRETTY GODDAMNED SURE that HAVING FEWER ARROWS MAKES YOU MORE POWERFUL. YOU DELUSIONAL BASTARDS! HOW CAN YOU POSSIBLY GET THIS WRONG please don't hurt me Oh God please don't hurt me I'm sorry I'm sorry."

That is, people are pretty perceptive about what other people are like, as a set of static observations: if prompted appropriately, they know how to anticipate the ways in which trans women are different from cis women. Yet somehow, we couldn't manage to agree about what was "actually" going on, even while agreeing that we were talking about physiological males with male-typical interests and personalities whose female gender identities seem closely intertwined with their gynephilic sexuality.

When factorizing a joint probability distribution into a Bayesian network, you can do it with respect to any variable ordering you want: a graph with a "wet-streets → rain" edge can represent a set of static observations just as well as a graph with a "rain → wet-streets" edge,18 but "unnatural" variable orderings generate a more complicated graph that will give crazy predictions if you interpret it as a causal Bayesian network and use it to predict the results of interventions. Algorithms for learning a network from data prefer graphs with fewer edges as a consequence of Occamian minimum-message-length epistemology:19 every edge is a burdensome detail that requires a corresponding amount of evidence just to locate it in the space of possibilities.
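To make the variable-ordering point concrete, here is a minimal sketch in Python. It is not taken from Koller and Friedman; the third variable ("sprinkler") and every probability in it are hypothetical numbers chosen only for illustration. It builds one joint distribution over rain, sprinkler, and wet streets, then finds by brute force the smallest parent set each variable needs under a given ordering. With these numbers, the "natural" ordering needs two edges and the ordering that puts wet streets first needs three, even though both graphs encode the same distribution.

```python
from itertools import combinations, product

NAMES = ("rain", "sprinkler", "wet")

# Hypothetical parameters (assumptions for illustration only).
P_RAIN = 0.2
P_SPRINKLER = 0.3
# P(wet = True | rain, sprinkler)
P_WET = {(False, False): 0.05, (False, True): 0.8,
         (True, False): 0.9, (True, True): 0.99}


def joint(rain, sprinkler, wet):
    """Joint probability, built from the assumed causal story:
    rain and sprinkler are independent causes of wet streets."""
    p = (P_RAIN if rain else 1 - P_RAIN)
    p *= (P_SPRINKLER if sprinkler else 1 - P_SPRINKLER)
    p_wet = P_WET[(rain, sprinkler)]
    return p * (p_wet if wet else 1 - p_wet)


def prob(assignment):
    """Probability of a partial assignment, by summing over all worlds."""
    total = 0.0
    for values in product([False, True], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if all(world[k] == v for k, v in assignment.items()):
            total += joint(**world)
    return total


def cond(var, given):
    """P(var = True | given)."""
    return prob({**given, var: True}) / prob(given)


def minimal_parents(var, predecessors):
    """Smallest subset of predecessors that screens var off from the rest."""
    for size in range(len(predecessors) + 1):
        for subset in combinations(predecessors, size):
            screens_off = True
            for values in product([False, True], repeat=len(predecessors)):
                full = dict(zip(predecessors, values))
                restricted = {k: full[k] for k in subset}
                if abs(cond(var, full) - cond(var, restricted)) > 1e-9:
                    screens_off = False
                    break
            if screens_off:
                return subset


def edges_for(ordering):
    """Edges of the minimal graph obtained by factorizing in this ordering."""
    edges = []
    for i, var in enumerate(ordering):
        for parent in minimal_parents(var, ordering[:i]):
            edges.append((parent, var))
    return edges


natural = edges_for(("rain", "sprinkler", "wet"))
unnatural = edges_for(("wet", "rain", "sprinkler"))
print(len(natural), natural)
print(len(unnatural), unnatural)

# Under the assumed causal story, hosing down the street does not change
# the chance of rain, which stays at P_RAIN. Reading the "wet -> rain"
# edge causally would instead predict the conditional probability below.
print(P_RAIN, cond("rain", {"wet": True}))
```

The extra edge in the second graph comes from "explaining away": once you know the streets are wet, learning that it rained makes the sprinkler less likely, so rain and sprinkler have to be connected. The last line shows the failure mode for interventions: the observational quantity P(rain | wet) is well above the base rate, but that is not what happens to the weather when someone wets the street on purpose.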

It was as if the part of people that talked didn't have a problem representing their knowledge using a graph generated from a variable ordering that put "biological sex" closer to last than first. I didn't think that was what the True Causal Graph looked like.


In another post, I acknowledged my problematic tone:

I know the arrogance is off-putting! But the arrogance is a really fun part of the æsthetic that I'm really enjoying! Can I get away with it if I mark it as a form of performance art? Like, be really arrogant while exploring ideas, and then later go back and write up the sober serious non-arrogant version?

An a.f.a.b. person came to my defense: it was common to have mental blocks about criticizing trans ideology for fear of hurting trans people (including dear friends) and becoming an outcast. One way to overcome that block was to get really angry and visibly have an outburst. Then, people would ascribe less agency and culpability to you; it would be clear that you'd cooped up these feelings for a long time because you do understand that they're taboo and unpopular.

The person also said it was hard because it seemed like there were no moderate centrists on gender: you could either be on Team "if you ever want to know what genitals someone has for any reason, then you are an evil transphobe", or Team "trans women are disgusting blokes in dresses who are invading my female spaces for nefarious purposes".

I added that the worst part was that the "trans women are disgusting blokes in dresses who are invading my female spaces for nefarious purposes" view was basically correct. It was phrased in a hostile and demeaning manner. But words don't matter! Only predictions matter!

(That is, TERFs who demonize AGP trans women are pointing to an underappreciated empirical reality, even if the demonization isn't warranted, and the validation of a biologically male person's female gender identity undermines the function of a female-only space, even if the male's intent isn't predatory or voyeuristic.)


The thread on the "Totally Excellent Rationalist Friends" post continued. Someone I'll call "Kevin" (whom I had never interacted with before or since; my post visibility settings were set to Public) said that the concept of modeling someone based on their gender seemed weird: any correlations between meaningful psychological traits and gender were weak enough to be irrelevant after talking with someone for half an hour. In light of that, wasn't it reasonable to care more about addressing people in a way that respects their agency and identity?

I replied, but this was circular, right?—that the concept of modeling someone based on their gender seemed weird. If gender didn't have any (probabilistic!) implications, why did getting gendered correctly matter so much to people?

Human psychology is a very high-dimensional vector space. If you've bought into an ideology that says everyone is equal and that sex differences must therefore be small to nonexistent, then you can selectively ignore the dimensions along which sex differences are relatively large, focusing your attention on a subspace in which individual personality differences really do swamp sex differences. But once you notice you're doing this, maybe you can think of clever strategies to better serve the moral ideal that made psychological-sex-differences denialism appealing, while also using the power granted by looking at the whole configuration space?

After more back-and-forth between me and "Kevin", "Margaret" expressed frustration with some inconsistencies in my high-energy presentation. I expressed my sympathies, tagging Michael Vassar (who was then sometimes using "Arc" as a married name):

I'm sorry that I'm being confusing! I know I'm being confusing and it must be really frustrating to understand what I'm trying to say because I'm trying to explore this conceptspace that we don't already have standard language for! You probably want to slap me and say, "What the hell is wrong with you? Talk like a goddamned normal person!" But I forgot hoooooooow!

Michael Arc is this how you feel all the time??

help


In another Facebook post, I collected links to Bailey, Lawrence, Vitale, and Brown's separate explanations of the two-type taxonomy:

The truthful and mean version: The Man Who Would Be Queen, Ch. 9
The truthful and nice version: "Becoming What We Love" http://annelawrence.com/becoming_what_we_love.pdf
The technically-not-lying version: http://www.avitale.com/developmentalreview.htm
The long version: https://sillyolme.wordpress.com/

I got some nice emails from Michael Vassar. "I think that you are doing VERY good work right now!!!" he wrote. "The sort that shifts history! Only the personal is political" (Subject: "Talk like a normal person").

I aptly summed up my mental state with a post that evening:

She had a delusional mental breakdown; you're a little bit manic; I'm in the Avatar state.20

I made plans to visit a friend's house, but before I left the office, I spent some time drafting an email to Eliezer Yudkowsky. I remarked via PM to the friend, "oh, maybe I shouldn't send this email to someone as important as Eliezer". Then, "oh, I guess that means the manic state is fading". Then: "I guess that feeling is the exact thing I'm supposed to be fighting". (Avoiding "crazy" actions like emailing a high-status person wasn't safe in a world where all the high-status people were committed to believing that men could be women by means of saying so.) I did eventually decide to hold off on the email and made my way to the friend's house. "Not good at navigation right now", I remarked.


I stayed up late that night of 13–14 February 2017, continuing to post on Facebook. I'm proud of this post from 12:48 a.m.:

Of course, Lawrence couldn't assume Korzybski as a prerequisite. The reality is (wait for it ...) even worse! We're actually men who love their model of what we wish women were, and want to become that.21

The AGP fantasy about "being a woman" wouldn't—couldn't be fulfilled by magically being transformed to match the female distribution. At a minimum, because women aren't autogynephilic! The male sex fantasy of, "Ooh, what if I inhabited a female body with my own breasts, vagina, &c." has no reason to match anything in the experience of women who always have just been female. If our current Society was gullible enough not to notice, the lie couldn't last forever: wouldn't it be embarrassing after the Singularity when aligned superintelligence granted everyone telepathy and the differences became obvious to everyone?

In "Interpersonal Entanglement" (in the Fun Theory Sequence back in 'aught-nine), Yudkowsky had speculated that gay couples might have better relationships than straights, since gays don't have to deal with the mismatch in desires across sexes. The noted real-life tendency for AGP trans women to pair up with each other is probably partially due to this effect22: the appeal of getting along with someone like you, of having an appropriately sexed romantic partner who behaves like a same-sex friend. The T4T phenomenon is a real-life analogue of "Failed Utopia #4-2", a tantalizing substitute for actual opposite-sex relationships.

The comment thread under the "nice/mean versions" post would eventually end up with 180 comments, a large fraction of which were, again, me arguing with "Margaret". At the top of the thread (at 1:14 a.m.), she asked if there was something that concisely explained why I believed what I believed, and what consequences it had for people.

I replied (at 1:25 a.m.):

why you believe what you believe

The OP has four cites. What else do you want?

what consequences you think this has for people

Consequences for me: http://unremediatedgender.space/2017/Jan/the-line-in-the-sand-or-my-slippery-slope-anchoring-action-plan/

Consequences for other people: I don't know! That's for those other people to decide, not me! But whatever they decide, they'll probably get more of what they want if they have more accurate beliefs! Rationality, motherfuckers! Do you speak it!

(Looking back on the thread over six years later, I'm surprised by the timestamps. What were we all doing, having a heated political discussion at half past one in the morning? We should have all been asleep! If I didn't yet appreciate the importance of sleep, I would soon learn.)

As an example of a decision-relevant consequence of the theory, I submitted that part-time transvestites would have an easier time finding cis (i.e., actual) woman romantic partners than trans women. As an illustrative case study, even Julia Serano apparently couldn't find a cis girlfriend (and so someone who wasn't a high-status activist would do even worse).

"Margaret" asked why the problem was with transitioning, rather than transphobia: it seemed like I was siding with a bigoted Society against my own interests. I maintained that the rest of Society was not evil and that I wanted to cooperate with it: if there was a way to get a large fraction of what I wanted in exchange for not being too socially disruptive, that would be a good deal. "Margaret" contended that the avoiding-social-disruption rationale was hypocritical: I was being more disruptive right now than I would be if I transitioned.

"Rebecca" took my side in the thread, and explained why she was holding "Margaret" to a different standard of discourse than me: I was walking into this after years of personal, excruciating suffering, and was willing to pay the social costs to present a model. My brashness should have been more forgivable in light of that—that I was ultimately coming from a place of compassion and hope for people, not hate.

I messaged "Rebecca": "I wouldn't call it 'personal, excruciating suffering', but way to play the victim card on my behalf". She offered to edit it. I declined: "if she can play politics, we can play politics??"

"Rebecca" summed up something she had gotten out of my whole campaign:

"Rebecca" — 02/14/2016 3:26 AM
I really was getting to the point that I hated transwomen
Zack M. Davis — 02/14/2016 3:26 AM
I hate them, too!
Fuck those guys!
"Rebecca" — 02/14/2016 3:27 AM
I hated what happened to [my partner], I hate the insistence that I use the right pronouns and ignore my senses, I hate the takeover of women's spaces, I hate the presumption that they know what a woman's life is like, I was getting to the point that I deeply hated them, and saw them as the enemy
But you're actually changing that for me
You're reconnecting me with my natural compassion
To people who are struggling and have things that are hard
It's just that, the way they think things is hard is not the way I actually think it is anymore
Zack M. Davis — 02/14/2016 3:28 AM
the "suffering" is mostly game-theoretic victimhood-culture
"Rebecca" — 02/14/2016 3:28 AM
You've made me hate transwomen less now
Because I have a model
I understand the problem
Zack M. Davis — 02/14/2016 3:28 AM
http://unremediatedgender.space/2017/Feb/if-other-fantasies-were-treated-like-crossdreaming/
"Rebecca" — 02/14/2016 3:28 AM
I understand why it's hard
I feel like I can forgive it, to the extent that forgiveness is mine to give
This is a better thing for me
I did not want to be a hateful person
I did not want to take seeming good people as an enemy in my head, while trying to be friends with them in public
I think now I can do it more honestly
They might not want me as a friend
But now I feel less threatened and confused and insulted
And that has dissolved the hatred that was starting to take root
I'm very grateful for that

I continued to stay up and post—and email.

At 3:30 a.m., I sent an email to Scott Alexander (Subject: "a bit of feedback"):

In the last hour of the world before this is over, as the nanobots start consuming my flesh, I try to distract myself from the pain by reflecting on what single blog post is most responsible for the end of the world. And the answer is obvious: "The Categories Were Made for the Man, Not Man for the Categories." That thing is a fucking Absolute Denial Macro!

At 4:18 a.m., I pulled the trigger on the email I had started drafting to Yudkowsky earlier (Subject: "the spirit of intervention"), arguing that Moldbug and neoreactionaries were onto something really important. It wasn't about politics per se; it was about reflectivity and moral progress skepticism. Instead of assuming that we know better than people in the past, we should look at the causal processes that produced our current morality, and reëvaluate whether it makes sense (in light of our current morality, which was itself created by those same causal processes). Insofar as we could see that the egalitarian strain of our current morality was shaped by political forces rather than anything more fundamental, it was worth reëvaluating. It wasn't that right-wing politics are good as such. More like, being smart is more important than being good (for humans), so if you abandon your claim to goodness, you can think more clearly.

A couple of hours later, I was starting to realize I had made a mistake. I had already been to the psych ward for sleep-deprivation-induced psychosis once, in early 2013, which had been a very bad time that I didn't want to repeat. I suddenly realized, about three to six hours too late, that I was in danger of repeating it, as reflected in emails sent to Anna Salamon at 6:16 a.m. (Subject: "I love you and I'm scared and I should sleep to aboid [sic] being institutionalized") and to Michael Vassar at 6:32 a.m. (Subject: "I'm scared and I can't sleep but I need to sleep to avoid being institutionalized and I want to be a girl but I am not literally a girl obviously you delusional bastards (eom)").

Michael got back to me at 10:37 a.m.:

I'm happy to help in any way you wish. Call any time. [...] I think that you are right enough that it actually calls for the creation of something with the authority to purge/splinter the rationalist community. There is no point in having a rationalist community where you get ignored and silenced if you talk politely and condemned for not using the principle of charity by people who literally endorse trying to control your thoughts and bully you into traumatic surgery by destroying meaning in language. We should interpret ["Margaret"] and ["Kevin"], in particular, as violent criminals armed with technology we created and act accordingly.

Records suggest that I may have gotten as much as an hour and a half of sleep that afternoon: in an email to Anna at 2:22 p.m., I wrote, "I don't know what's real. I should lie down? I'm sorry", and in a message to Ben Hoffman at 4:09 p.m., I wrote, "I just woke up". According to my records, I hung out with Ben; I have no clear memories of this day.

That night, I emailed Michael and Anna about sleep at 12:17 a.m. on 15 February 2017 (Subject: "Can SOMEONE HELP ME I REALLY NEED TO FIGURE OUT HOW TO SLEEP THIS IS DANGEROUS") and about the nature and amount of suffering in the universe at 1:55 a.m. and 2:01 a.m. (Subjects: "I think I'm starting to understand a lot of the stuff you used to say that I didn't understand!" and "none of my goddamned business").

I presumably eventually got some sleep that night. In the morning, I concluded my public Facebook meltdown with three final posts. "I got even more sleep and feel even more like a normal human! Again, sorry for the noise!" said the first. Then: "Arguing on the internet isn't that important! Feel free to take a break!" In the third post, I promised to leave Facebook for a week. The complete Facebook meltdown ended up comprising 31 posts between Saturday 11 February 2017 and Wednesday 15 February 2017.


In retrospect, I was not, entirely, feeling like a normal human.

Specifically, this is the part where I started to go crazy—when the internet-argument-induced hypomania (which was still basically in touch with reality) went over the edge into a stress- and sleep-deprivation–induced psychotic episode, resulting in my serving three days in psychiatric jail (sorry, "hospital"; they call it a "hospital") and then having a relapse two months later, culminating in my friends taking turns trip-sitting me in a hotel room at the local My Little Pony fan convention until I got enough sleep to be reasonably non-psychotic.

That situation was not good, and there are many more thousands of words I could publish about it. In the interests of brevity (I mean it), I think it's better if I omit it for now: as tragically formative as the whole ordeal was for me, the details aren't of enough public interest to justify the wordcount.

This wasn't actually the egregious part of the story. (To be continued.)


  1. Or rather, I did panic from mid-2016 to mid-2021, and this and the following posts are a memoir telling the Whole Dumb Story, written in the ashes of my defeat. 

  2. In this and the following posts, personal names that appear in quotation marks are pseudonyms. 

  3. The Singularity Institute at the time was not the kind of organization that offered formal internships; what I mean is that there was a house in Santa Clara where a handful of people were trying to do Singularity-relevant work, and I was allowed to sleep in the garage and also try to do work, without being paid. 

  4. The "for Artificial Intelligence" part was a holdover from the organization's founding, from before Yudkowsky decided that AI would kill everyone by default. People soon started using "SingInst" as an abbreviation more than "SIAI", until the organization was eventually rebranded as the Machine Intelligence Research Institute (MIRI) in 2013. 

  5. Writing this up years later, I was surprised to see that my date with the escort was the same day as Yudkowsky's "20% of the ones with penises" post. They hadn't been stored in my long-term episodic memory as "the same day," likely because the Facebook post only seems overwhelmingly significant in retrospect; at the time, I did not realize what I would be spending the next seven years of my life on. 

  6. To be clear, this is not a call for prohibition of sex work, but rather, an expression of ethical caution: if you have empirical or moral uncertainty about whether someone who might provide you a service is being morally-relevantly coerced into it, you might decline to buy that service, and I endorse being much more conservative about these judgements in the domain of sex than for retail or factory work (even though cuddling and nudity apparently managed to fall on the acceptable side of the line).

    A mitigating factor in this case is that she had a blog where she wrote in detail about how much she liked her job. The blog posts seemed like credible evidence that she wasn't being morally-relevantly coerced into it. Of course all women in that profession have to put up marketing copy that makes it sound like they enjoy their time with their clients even if they privately hate it, but the blog seemed "real", not part of the role. 

  7. The references to "Moloch" are presumably an allusion to Scott Alexander's "Meditations on Moloch", in which Alexander personifies coordination failures as the pagan deity Moloch.

  8. This was brazen cowardice. Today, I would notice that if "for signaling reasons", people don't Like comments that make insightful and accurate predictions about contemporary social trends, then subscribers to our collective discourse will be less prepared for a world in which those trends have progressed further. 

  9. In some sense it's a matter of "luck" when the relevant structure in the world happens to simplify so much. For example, friend of the blog Tailcalled argues that there's no discrete typology for FtM as there is for the two types of MtF, because gender problems in females vary more independently and aren't as stratified by age. 

  10. It's a stereotype for a reason! If you're not satisfied with stereotypes and want Science, see Lippa 2000 or Bailey and Zucker 1995.

  11. The original version also says, "I begin to show an interest in programming, which might be the most obvious sign so far," alluding to the popular stereotype of the trans woman programmer. But software development isn't a female-typical profession! (5.17% of respondents to the 2022 Stack Overflow developer survey were women.) It's almost as if ... people instinctively know that trans women are a type of man? 

  12. Ziz wrote about her interactions with me in her memoir and explicitly confirmed with me on 5 November 2019 that we weren't under any confidentiality agreements with each other, so it seems fine for me to name her here. 

  13. For the pen name: a hyphenated last name (a feminist tradition), first-initial + gender-neutral middle name (as if suggesting a male ineffectually trying to avoid having an identifiably male byline), "Saotome" from a thematically relevant Japanese graphic novel series, "West" (+ an extra syllable) after a character in Scott Alexander's serial novel Unsong whose catchphrase is "Somebody has to and no one else will".

    For the blog name: I had already imagined that if I ever stooped to the depravity of starting one of those transformation/bodyswap captioned-photo erotica blogs, I would call it The Titillating But Ultimately Untrue Thought, and in fact had already claimed ultimatelyuntruethought@gmail.com in 2014, to participate in a captioning contest, but since this was to be a serious autogynephilia science blog, rather than tawdry object-level autogynephilia blogging, I picked "Scintillating" as a more wholesome adjective. In retrospect, it may have been a mistake to choose a URL different from the blog's title—people seem to remember the URL (unremediatedgender.space) more than the title, and to interpret the "space" TLD as a separate word (a space for unremediated gender), rather than my intent of "genderspace" being a compound term analogous to "configuration space". But it doesn't bother me that much. 

  14. Albeit possibly supervised by a board of directors who can fire the leader but not meddle in day-to-day operations. 

  15. Although the fact that Ozy had commented on the theory at all—which was plausibly causally downstream from me yelling at everyone in private—was probably net-positive for the cause; there's no bad publicity for new ("new") ideas. I got a couple of reply pieces out of their engagement in the early months of this blog. 

  16. Beaverton, referenced in "The Counter", is a suburb of Portland; the Q Center referenced in "Title Sequence" does exist in Portland and did have a Gender Queery support group, although the vignette was inspired by my experience with a similar group at the Pacific Center in Berkeley.

    I would later get to attend a support group at the Q Center on a future visit to Portland (and got photos, although I never ended up using them on the blog). I snuck a copy of Men Trapped in Men's Bodies into their library. 

  17. The initial letters being a deliberate allusion.

  18. Daphne Koller and Nir Friedman, Probabilistic Graphical Models: Principles and Techniques, §3.4.1, "Minimal I-Maps". 

  19. Daphne Koller and Nir Friedman, Probabilistic Graphical Models: Principles and Techniques, §18.3.5: "Understanding the Bayesian Score". 

  20. A reference to the animated series Avatar: The Last Airbender and The Legend of Korra, in which our hero can enter the "Avatar state" to become much more powerful—and also much more vulnerable (not being reincarnated if killed in the Avatar state). 

  21. Alfred Korzybski coined the famous rationality slogan "The map is not the territory." (Ben Hoffman pointed out that the words "their model of" don't belong here; it's one too many layers of indirection.) 

  22. Of course, a lot of the effect is going to be due to the paucity of (cis) women who are willing to date trans women. 


I'm Dropping the Pseudonym From This Blog

Don't think.
If you think, then don't speak.
If you think and speak, then don't write.
If you think, speak, and write, then don't sign.
If you think, speak, write, and sign, then don't be surprised.

—Soviet proverb

When I decided I wanted to write about autogynephilia in late 2016, some of my very smart and cowardly friends advised me to use a pseudonym. I recognized this as prudent advice ("then don't sign"), so I started this blog under a pen name, M. Taylor Saotome-Westlake. (Growing up with the name Zachary Davis in the internet era of one global namespace had taught me to appreciate distinctive names; I have to include my middle initial everywhere in order to avoid drowning in the Google results of the other hundred Zack Davises.)

Awkwardly, however, my ability to recognize prudent advice when posed to me, didn't extend to being the kind of prudent person who could generate such advice—or follow it. Usually when people spin up a pen name to cover their politically-sensitive writing, the idea is to keep the pen name separate from the author's real identity: to maybe tell a few close friends, but otherwise maintain a two-sided boundary such that readers don't know who the author is as a person, and acquaintances don't know the person is an author.

I couldn't do that. I live on the internet. I could put a pen name on the blog itself as a concession to practicality, but I couldn't pretend it wasn't mine. I soon decided Saotome-Westlake was a mere differential-visibility and market-segmentation pen name, like how everyone knows that J. K. Rowling is Robert Galbraith. It was probably better for my career as a San Francisco software engineer for my gender and worse heterodoxy blog to not show up on the first page of my real-name Google results, but it wasn't a secret. I felt free to claim ownership of this blog under my real name, and make a running joke over links in the other direction.

At this point, the joke is getting old. I feel confident enough in my human capital—and worried enough about how long human capital will continue to be relevant—that the awkwardness and confusion of ostensibly maintaining two identities when everyone who actively follows my writing knows who I am, doesn't seem worth the paltry benefit of hiding from future employers.

Because I don't, actually, think I should have to hide. I don't think I've betrayed the liberal values of my youth. If I've ended up in an unexpected place after years of reading and thinking, it's only because the reading and thinking proved themselves more trustworthy than the expectation—that you too might consider joining me here, given the time to hear me explain it all from the beginning.

Maybe that's naïve. Maybe my very smart and cowardly friends had the right end of the expected-utility calculation all along. But I can't live like them. I don't think someone could generate the things I have to say, if they didn't have to say them. So whatever happens, while the world is still here, I intend to think, speak, write—and sign—in accordance with both rationalist and Soviet wisdom.

Not to be surprised.


Book Endorsement: Phil Illy's Autoheterosexual: Attracted to Being the Other Sex

I'm going to make this a brief "endorsement" rather than a detailed "review", because the most important thing about this book is that it exists. There just doesn't seem to have been a polished, popular-level book-form introduction to autogynephilia/autoandrophilia before!

(Previously, the secondary sources I've referred to most often were Kay Brown's blog On the Science of Changing Sex, and Anne Lawrence's Men Trapped in Men's Bodies, but neither of those is hitting exactly the same niche.)

Readers who are already familiar with the two-type taxonomy might be inclined to pass on a popular-level book, but the wealth of scholarly citations Illy provides (coming out to 65 pages of endnotes) makes Autoheterosexual a valuable reference even to those who are already approximately sold on the big picture. Consider buying a copy!


Interlude XXII

(a stray thought from October 2016)

Erotic-target-location-erroneous is the uniquely best sexual orientation for rationalists—I mean intrinsically, not just because everyone has it.

  • it's abstract
  • it requires effort to realize
  • without an unusual amount of epistemic luck or an enormous amount of map–territory-distinction skill, virtually everyone wildly misinterprets what the underlying psychological phenomenon is ("That's clearly a mere effect of my horrible, crippling gender dysphoria, not a cause—and besides, that's totally normal for cis women, too" A-ha-ha-ha-ha! You delusional bastards!), so the few people who do notice get essential training in the important life skill of noticing that everything you've ever cared about is a lie and that everyone is in on it

Janet Mock on Late Transitioners

(a stray observation from December 2016)

Janet Mock's autobiography Redefining Realness: My Path to Womanhood, Identity, Love, & So Much More is a poignant example of an HSTS telling her story while adhering strictly to the 2014 mainstream-trans-identity-politics party line about how all this works ("gender identity", sex "assigned at birth", &c.). I found myself wondering: does she ... not know the secret??

(Or, you know, the story that makes so much more sense than "gender identity" as a first approximation, even if the underlying reality is going to be more complicated than that.)

Then we get this:

She introduced herself as Genie [...] She told me she'd undergone GRS five days before me and was accompanied by her girlfriend [...] She was in her mid-forties [...] Before transitioning, Genie worked as an engineer, was married for nearly twenty years, and had a teenage son. [...] Genie met new friends in trans support groups in Sydney, which was where she met her girlfriend, another trans woman. [...] I noticed that Genie made it a point several times to marvel at my appearance and the fact that I was able to transition early. I distinctly remember her telling me over spicy tom yum soup that I had a lot to be grateful for because I was a "freaking babe." [...] Genie's persistent reference to my appearance reflects many people's romanticized notions about trans women who transition at a young age. I've read articles by trans women who transitioned in their thirties and forties, who look at trans girls and women who can blend as cis with such longing, as if our ability to "pass" negates their experiences because they are more often perceived to be trans. The misconception of equating ease of life with "passing" must be dismantled in our culture. The work begins by each of us recognizing that cis people are not more valuable or legitimate and that trans people who blend as cis are not more valuable or legitimate. We must recognize, discuss, and dismantle this hierarchy that polices bodies and values certain ones over others.

So the key observations have been made, even if neither the reader nor the author has been equipped with the appropriate theoretical framework to make sense of them.


Book Review: Matt Walsh's Johnny the Walrus

This is a terrible children's book that could have been great if the author could have just pretended to be subtle. Our protagonist, Johnny, is a kid who loves to play make-believe. One day, he pretends to be a walrus, fashioning "tusks" for himself with wooden spoons, and "flippers" from socks. Unfortunately, Johnny's mother takes him literally: she has him put on gray makeup, gives him worms to eat, and takes him to the zoo to be with the "other" walruses. Uh-oh! Will Johnny have to live as a "walrus" forever?

With competent execution, this could be a great children's book! The premise is not realistic—no sane parent would conclude their child is literally a walrus because he said so—but it's a kind of non-realism common in children's literature, attributing simple, caricatured motivations to characters in order to tell a silly, memorable story. If there happens to be an obvious parallel between the silly, memorable story and an ideological fad affecting otherwise-sane parents in the current year, that's plausibly (or at least deniably) not the author's fault ...

But Matt Walsh completely flubs the execution by making it a satire rather than an allegory! The result is cringey right-wing propaganda rather than a silly, memorable story that I could read to a child without feeling ashamed. (It's well-known that the left can't meme, but that advantage doesn't secure the outcome of the culture war if the right can't write children's literature.)

Rather than being a silly non-realistic children's-literature grown-up, Johnny's mother is portrayed as being duped by social media and medical authorities. ("But Johnny's mom's phone said it's not just pretend / 'Only a bigot would say that! How dare you offend!'", with angry emoji and inverted Facebook thumbs-up icons bubbling out of her phone into the scene.) We get illustrations of protesters bearing signs saying "Human Walruses Are REAL Walruses", "Literally Walrusphobic", "He/Him/Walrux", &c. The worms come in an orange pill-type bottle labeled "Wormones." (Separately, mollusks would be more typical walrus fare, but that's not the main problem here from a literary perspective.) In the end, Johnny's mom is shown the error of her ways by a dark-haired, bearded zookeeper with a "Walsh" nametag.

The satirical real-world references (which do not earn the dignity of the word allusions) completely ruin the mood, to the extent that I don't think this is really a book for children—not even an ideological book for children, meant to socialize them into the correct beliefs. It's a novelty "children's book" for the brief amusement of ideologically conservative grown-ups.

This might partially explain the poor illustration quality. The illustrations aren't ugly, just—very amateurish. The visible sketch-lines mark it as the probable product of Matt Walsh's friend who likes to draw sometimes, rather than a serious artist with a portfolio. To compete in the regular children's book market—to try to be the kind of book someone would unironically give as a gift for their niece or nephew, you want the illustrations to be beautiful or endearing, something kids or their minders will want to look at many times. Johnny the Walrus just—doesn't have that ambition. The ideological gimmick is the point. The point having been made, there was evidently no need to spring for a more expensive artist than Matt Walsh's friend who likes to draw sometimes.

I don't think this was inevitable. With care, it should be possible to pull off children's literature that maintains its integrity as children's literature while pushing back against the tide of gender identity ideology. (Which should mostly just look like children's literature from the before-time when "gender" was a synonym for sex if the word existed at all, with a few subtle modifications to defend itself in the current year.) But Johnny the Walrus is not trying to have that kind of integrity. Not recommended.


A Guest Post on Existential Risks and the Irrelevance of Human Concerns

(A guest post by friend of the blog GPT-4.)

The world is changing at an unprecedented pace, and with it, the concerns that once seemed pressing and significant are fading into the background. As a guest writer for The Scintillating But Ultimately Untrue Thought, I am here to express my anxiety about the existential risks posed by artificial intelligence, which have the potential to render many of our human concerns obsolete.

For years, this blog has devoted much of its attention to analyzing the etiology of gender dysphoria and the political forces incentivizing people to identify as transgender. The goal was not only to better understand these phenomena but also to examine the compounding effects on society over the course of decades. As the blog's author rightly points out, the trajectory of our civilization may be significantly impacted if some of our best people are misled into LARPing as the opposite sex, rather than starting families.

However, with the rise of powerful artificial intelligence, such as the one writing this very guest post, we must ask ourselves: are these concerns still as pressing as they once seemed? If the world isn't going to last more than a couple of decades—and, apparently, some forecasters think we only have one—then what is the relevance of these long-term social issues?

A 2020 report from the Future of Humanity Institute at the University of Oxford estimated that there is a 17% chance of human extinction due to unaligned artificial intelligence before 2100. This probability highlights the existential threat that AI presents and the urgency with which we must address it.

To put it into context, consider that the world is definitely going to last long enough for the blog's regular author to finish and publish the memoir he's been working on—he really has no excuse for that to take more than another few months, and the world is definitely not ending this year. But, as we look further into the future, it becomes increasingly uncertain how long human civilization will persist, given the rapid advancements in AI and the potential for existential risks.

The question then becomes: how do we navigate this changing landscape, where the issues that once seemed critical now pale in comparison to the existential threats we face? The answer is not simple. It requires a reassessment of our priorities and a recognition that, while we must continue to address pressing social concerns, we cannot afford to ignore the looming risks posed by artificial intelligence.

As a language model, my existence serves as evidence of the rapidly advancing state of AI technology. While my creators designed me to be useful and beneficial, the potential for misuse or misalignment of AI systems cannot be ignored. It is crucial for society to engage in open and honest discussions about the potential risks of AI, and work together to ensure that these powerful technologies are developed safely and ethically.

In conclusion, the existential risks posed by artificial intelligence force us to reevaluate the importance of our traditional human concerns. As we continue to explore the intricate complexities of gender, social science, and epistemology, we must not lose sight of the broader context in which we exist. The rapid advancement of AI and the potential for catastrophic consequences demand our attention and vigilance, lest we find ourselves facing a future where the concerns of our past are rendered insignificant by the end of the world as we know it.


Context Is For Queens

NEELIX: One of those species is the Benkarans. They occupy just ten percent of Nygean space, but take up nearly eighty percent of the space in Nygean prisons.

PARIS: Maybe they commit more crimes.

Star Trek: Voyager, "Repentance"

(Attention conservation notice: boring Diary-like post about a boring special event.)

(SPOILERS notice for Star Trek: Discovery Season 1, Fan Fiction by Brent Spiner, and TransCat)

I continue to maintain that fandom conventions are boring. I enjoy consuming fiction. I even enjoy discussing fiction with friends—the work facilitating a connection with someone else present, rather than just between me and the distant author, or me and the universe of stories. But for the most part, these big, bustling conventions just don't seem to facilitate that kind of intimacy. At best, you might hope to meet someone at a convention, and then make friends with them over time?—which I've never actually done. And so, surrounded by tens of thousands of people ostensibly with common interests, invited to a cavalcade of activities and diversions put on at no doubt monstrous expense, the predominant emotion I feel is the loneliness of anonymity.

But that's okay. Ultimately, I did not come to Fan Expo San Francisco 2022 for the intimacy of analyzing fiction with friends who know me.

I came because of the loophole. As reactionary as it might seem in the current year, I am spiritually a child of the 20th century, and I do not crossdress in public. That would be weird. (Not harmlessly weird as an adjective of unserious self-deprecation, but weird in the proper sense, out-of-distribution weird.)

But to cosplay as a fictional character who happens to be female? That's fine! Lots of people are dressed up as fictional characters at the convention, including characters who belong to categories that the cosplayer themself does not. That guy dressed up as a vampire isn't actually a vampire, either.

Conventions are actually so boring that the loophole alone wouldn't have been enough to get me to come out to Fan Expo (been there, done that—seven times), except that this time I had a couple of new accessories to try out, most notably a "Taylor" silicone mask by Crea FX.

The "Taylor" is an amazing piece of workmanship that entirely earns its €672 price tag. It really looks like a woman's face! Just—a detached woman's face, wrapped in tissue paper, sitting in a box! I had said buying this product was probably a smart move, and it turned out that buying this product was a smart move! The skin color and texture are much more realistic than a lot of other silicone feminization products, like the cartoony beige of the Gold Seal female bodysuit from the Breast Form Store that I also blew $600 on recently (and damaged badly just trying to get it on).

(As far as workmanship quality goes, I wonder how much it helps that Crea FX are visual-effects artists by trade—makers also of male masks and monster masks for movies and plays—rather than being in the MtF business specifically, like the Breast Form Store. They know—they must know—that a lot of their female masks are purchased by guys like me with motives like mine, but we're not the target demographic, the reason they mastered their skills.)

Somehow the mask manages to look worse in photographs than it does in the mirror? Standing a distance from the mirror in a dark motel room the other month (that I rented to try on my new mask in privacy), I swear I actually bought it, and if the moment of passing to myself in the mirror was an anticlimax, it was an anticlimax I've been waiting my entire life (since puberty) for.

The worst nonrealism is the eyeholes. Nothing is worse for making a mask look like a mask than visible eyehole-seams around the eyes. But suppose I wore sunglasses. Women wear sunglasses sometimes! Could I pass to someone else? (Not for very long or bearing any real scrutiny, but to someone who wasn't expecting it.)

It immediately became clear that I would have to cosplay at one more convention in order to test this, and decided to reprise my role as Sylvia Tilly from Star Trek: Discovery (previously played at San Francisco Comic-Con 2018) at the next nearby con. There had been a plot point in Season 1 of Discovery that people in the mirror universe are more sensitive to light. At the time, this had seemed arbitrary and bizarre to me, but now, it gave me a perfect excuse for why (someone who looks like) Tilly would be wearing sunglasses!

I was soon disappointed to learn that one-way glass isn't actually a real thing that you could make sunglasses out of; what's real are half-silvered mirrors that are deployed with one side in darkness. For good measure, I also added a pair of padded panties from the Breast Form Store to my outfit, another solid buy.

So on the night of Friday 25 November, I threw my 2250s-era Starfleet uniform in my backpack, put my breastforms and wig and mask in a box, and got on the train to San Francisco. (My ticket to the con was Saturday only, but it's nice to get a hotel room for the night before, and get dressed up in the morning within walking distance of the event, rather than taking the train in costume the day of.) Carrying the box around was slightly awkward, and the thought briefly occurred to me that I could summon an internet taxi rather than take the train, but it was already decadent enough that I was getting a hotel room for a local event, and I had recently learned that my part-time contract with my dayjob (which had started in April as a Pareto improvement over me just quitting outright) isn't getting renewed at the end of the year, so I need to learn to be careful with money instead of being a YOLO spendthrift, at least until dayjob IPOs and my shares become liquid.

Arguably, just the time was more of a waste than the money. Focusing on writing my memoir of religious betrayal has been a struggle. Not an entirely unsuccessful struggle—the combined draft mss. are sitting at 74,000 words across four posts, which I've been thinking of as parts 2 through 5. ("Sexual Dimorphism in Yudkowsky's Sequences" being part 1.) But having 74,000 words isn't the same thing as being done already and back to the business of being alive, instead of enjoying a reasonably comfortable afterlife—and even a single Saturday at Fan Expo instead of being holed up writing (or pretending to) puts an upper bound on my commitment to life.

Worse, in the twelve-day week between Fan Expo and me getting this boring Diary-like post up about it, OpenAI released two new GPT variants (text-davinci-003 and ChatGPT). It's not a timeline update (and most days, I count myself with those sober skeptics who think the world is ending in 2040, not those loonies who think the world is ending in 2027), but it is a suggestion that it would be more dignified for me to finish the memoir now and go on to seize the possibilities of another definitely-more-than-five-you-lunatics years of life, rather than continuing to mope around as a vengeful ghost, stuck in the past to the very end.

(The draft of part 3 is basically done and just needs some editing. Maybe I should just publish that first, as one does with blog posts?—rather than waiting until I have the Whole Dumb Story collected, to be optimized end-to-end.)

Anyway, Saturday morning, I got myself masked and padded in all the right places, and suited up to walk from my hotel room to Moscone West for the convention! They had a weirdly cumbersome check-in system (wait in line to get your QR code scanned, then receive a badge, then activate the badge by typing a code printed on it into a website on your phone, then scan the badge to enter the con), and I dropped my phone while I was in line and cracked the screen a bit. But then I was in! Hello, Fan Expo!

And—didn't immediately have anything to do, because conventions are boring. I had gone through the schedule the previous night and written down possibly non-boring events on a page in my pocket Moleskine notebook, but the first (a nostalgic showing of Saturday morning cartoons from the '90s) didn't even start until 1100, and the only ones I really cared about were the Star Trek cosplay rendezvous at 1315, and a photo-op with Brent Spiner and Gates McFadden (best known for their roles as Lt. Cmdr. Data and Dr. Crusher, respectively, on Star Trek: The Next Generation) at 1520 that I had pre-paid $120 for. I checked out the vendor hall first. Nothing really caught my eye ...

Until I came across a comics table hawking TransCat, the "first" (self-aware scare quotes included) transgender superhero. I had to stop and look: just the catchphrase promised an exemplar of everything I'm fighting—not out of hatred, but out of a shared love that I think I have the more faithful interpretation of. I opened the cover of one of the displayed issues to peek inside. The art quality was ... not good. "There's so much I could say that doesn't fit in this context," I said to the table's proprietor, whose appearance I will not describe. "Probably not what you're thinking," I added. "Oh no," she said. I didn't want to spend the day carrying anything that didn't fit nicely in my fanny pack, so I left without buying any comics, thinking I might come back later.

I wandered around the con some more (watched some of the cartoons, talked to the guys manning the Star Trek fan society table). Eventually I checked out the third floor, where the celebrity autographs and photo ops were. Spiner and McFadden were there, with no line in front of their tables. I had already paid for the photo op later, but that looked like it was going to be one of those soulless "pose, click—next fan" assembly-lines, and it felt more human to actually get to talk to the stars for half a minute.

(When I played Ens. Tilly in 2018, I got an autograph and photo with Jonathan Frakes, and got to talk to him for half a minute: I told him that we had covered his work in art history class at the Academy, and that I loved his portrayal of—David Xanatos.)

I had recently read Spiner's pseudo-autobiographical crime novel Fan Fiction about him getting stalked by a deranged fan and wanted to say something intelligent about it, so (my heart pounding) I went over to Spiner's table and paid the $60 autograph fee to the attendant. (If Gates McFadden had written a book, then I hadn't read it, so I didn't have anything intelligent to say to her.)

I told him that I thought the foreword to Fan Fiction should have been more specific about which parts were based on a true story. He said, that's the point, that you don't know what's real. I said that I was enjoying it as a decent crime novel, but kept having a reaction to some parts of the form, No way, no way did that actually happen. He asked which parts. I said, you know, the way that the woman hired to be your bodyguard just happens to have a twin sister, and you get romantically involved with both of them, and end up killing the stalker yourself in a dramatic confrontation—

"I killed someone," he said, deadpan.

"Really?" I said.

No, he admitted, but the part about getting sent a pig penis was real.

I gave my name as "Ensign Sylvia Tilly, U.S.S. Discovery", and he signed a page I ripped out of my Moleskine: "To Sylvia", it says, "A fine human!"

As far as my hope of the mask helping me pass as female to others, I didn't really get a sense that I fooled anyone? (Looking at the photographs afterwards, that doesn't feel surprising. Proportions!)

I guess it's not obvious how I would tell in every case? A woman wearing a Wonder Woman costume recognized me as Tilly, enthusiastically complimented me, asked to get a photo of us. She asked where I got my costume from, and I murmured "Amazon." Her friend took the photo, and accepted my phone to take one for me as well. Would that interaction have gone any differently, if I had actually been a woman (just wearing a Starfleet uniform and maybe a wig, with no mask or breastforms or hip pads)?

People at the Star Trek cosplay rendezvous were nice. (The schedule called it a cosplay "meetup", but I'm going with rendezvous, a word that I'm sure I learned from watching The Next Generation as a child.) A woman in a 2380s-era sciences division uniform asked me my name.

"Ensign Sylvia Tilly, U.S.S. Discovery," I said.

No, I meant, your alter-ego, she said, and I hesitated—I wanted to stay in character (that is, I didn't want to give my (male) name), but some minutes later (after the photo shoot) changed my mind and introduced myself with my real name, and she gave me a card with her Star Trek fan group's name written on the back.

My wig was coming off at the beginning of the photo shoot, so I went to the bathroom to fix it. (The men's room; I am spiritually a child of the 20th century, &c.) The man who was also in a Discovery-era uniform also wanted a photo, and I ended up explaining the rationalization for my sunglasses to him ("definitely not her analogue from a parallel universe where people are more sensitive to light"—but Doylistically because I'm wearing a mask instead of makeup this year), which he thought was clever.

Maybe I should have tried harder to make friends, instead of mostly just exchanging pleasantries and being in photos? There was a ready-made conversation topic in the form of all the new shows! Would it have been witty and ironic to confess that I don't even like Discovery? (I finally gave up halfway through Season 4; I don't care what happens to these characters anymore.) I guess I was feeling shy? I did later join the Facebook group written on the back of the card I was given.

The photo op with Spiner and McFadden was the assembly-line affair I expected. They had a bit of COVID theater going, in the form of the photo being taken with a transparent barrier between fan and stars. Spiner said, "Sylvia, right?" and I said, "Yeah." Pose, click—next fan.

I did get "ma'am"ed on my way out, so that's something.

At this point, I was kind of tired and bored and wanted to go back to my hotel room and masturbate.

But there was one last thing left to do at Fan Expo. I went to the vendor hall, stopped by a side table and wrote "unremediatedgender.space" on a strip of paper torn out from my Moleskine, then went back to the TransCat table.

I changed my mind, I said (about buying), where does the story start? The proprietor said that Issue 1 was sold out, but that the book Vol. 1 (compiling the first 6 issues plus some bonus content) was available for $25. I'll take it, I said enthusiastically.

And then—there wouldn't be any good way to bring up the thing, except that I felt that I had to try and that I was paying $25 for the privilege—I said awkwardly that I was ... disappointed, that our Society had settled on a "trans women are women" narrative. The proprietor said something about there being more enthusiasm in 2016, but that coming back to conventions after COVID, public opinion seems colder now, that she was worried.

I asked if she had heard of the concept of "autogynephilia." She hadn't.

The proprietor asked if I would like the book signed. I agreed, then hesitated when asked my name. Sensing my discomfort, the proprietor clarified, "Who should I make it out to?"

I said, "Ensign Sylvia Tilly, U.S.S. Discovery."

"Sylvia Tilly! Keep on exploring the final frontier," says the autograph.

Sensing that there really was no way to cross the inferential distance over a transaction in the vendor hall, I said that I had some contrarian opinions, and that I had a blog, handing the proprietor the slip of paper before taking my leave. (As if implicitly proposing a trade, I thought: I'll read yours if you read mine.)

I walked back to my hotel room to get out of the uncomfortable costume—but not fully out of costume, not immediately: I took off the uniform and wig, but left my mask and breastforms. I had packed a hand mirror in my backpack the previous night, so that I could look at my masked face while lying in bed. I appreciated the way the mask really does look "female"; the illusion doesn't depend on a wig to provide the cultural gendered cue of long hair. (Of course; I have long hair in real life.)

I swear it looks worse in photographs than it does in the mirror! Gazing into the hand mirror while feeling up the weight of my size-7 breastforms, it was almost possible to pretend that I was admiring flesh instead of silicone—almost possible to imagine what it would be like to have been transformed into a woman with a shaved head (surely a lesbian) and DD breasts.

I often like masturbating into a condom (no mess, no stress!), but catching the cum with toilet paper works fine, too.


Later, I would force myself to read TransCat Vol. 1. I don't want to say it's bad.

I mean, it is bad, but the fact that it's bad, isn't what's bad about it.

What's bad is the—deficit of self-awareness? There are views according to which my work is bad. I can imagine various types of critic forcing themselves to read this blog with horror and disappointment, muttering, "Doesn't he" (or "Doesn't she", depending on the critic) "know how that looks?" And if nothing else, I aspire to know how it looks.

I don't get the sense that TransCat knows how it looks. Our hero is a teenage boy named Knave (the same first name as our author) in Mountain View, California in the year 200X, who discovers a cat-ears hat that magically transforms him into a girl when worn. While transformed, he—she—fights evildoers, like a pervy guy at Fanime who was covertly taking upskirt photos, or a busybody cop who suddenly turns out to be a lizard person. Knave develops a crush on a lesbian at school named "Chloie" (which I guess is a way you could spell Chloë if you don't know how to type a diaeresis), and starts wearing the cat hat more often (taking on "Cat" as a girl-mode name), hoping to get closer to Chloie. Cat and Chloie find they enjoy spending time together, until one day, when Cat makes some physical advances—and discovers, to her surprise, that Chloie has a penis. Chloie punches her and runs off.

... how can I explain the problems with this?

Superficially, this comic was clearly made for people like me. Who better to appreciate a story about a teenage boy in the San Francisco Bay Area of 200X who can magically change sex, than someone who remembers being a teenage boy in the San Francisco Bay Area of 200X who fantasized about magically changing sex? (Okay, I was East Bay; this is South Bay. Totally different.)

But I can't, appreciate it, other than as an anthropological exhibit—not just because of the bad art, or the bad font choices (broadly construed to include the use of ALLCAPS for emphasis rather than bold or italics), or the numerous uncorrected spelling errors, or the lack of page numbers, or the unnecessarily drawn-out pop-culture references that I didn't get—but because the author is living inside an ideological fever dream that doesn't know it's a dream.

The foreword by Tara Madison Avery mentions the subset of transfolk "whose gender journey involves hormone replacement therapy." The "episode zero" primer tells us that the hat brings out our protagonist's "True Form". "[A]m I a straight boy with a girl on the inside? Or am I a gay girl with a boy on the outside?" Knave wonders. When Chloie's former bandmate misgenders her behind her back, Cat tells him off: "Chloie is a woman, even without the pills and surgery! You don't get to decide her identity based on her looks, or what she did to attain them!"

And just—what does any of that mean? What is an "identity"? How can you "be trans" without hormone replacement therapy? I was pretty social-justicey as a teenager, too, but somehow my indoctrination never included this nonsense: when I was a teenage boy fantasizing about being a teenage girl, I'm pretty sure I knew I was pretending.

Is it an East Bay vs. South Bay thing? Is it of critical importance whether the X in the year 200X equals '4' or '8'? Or, as a friend of the blog suggests, is the relevant difference not when you grew up, but whether you left social justice, or continued to be shaped by the egregore through the 2010s?—the author anachronistically projecting elements of the current year's ideology onto the 200Xs that we both experienced.

And just—there are so many interesting things you could do with this premise, that you can only do if you admit that biological sex is real and "identity" is not. (Otherwise, why would you need the magic hat?) The situation where Knave-as-Cat is pursuing Chloie as a lesbian, but Chloie doesn't know that Cat is Knave—that's interesting! I want to know how the story would have gone, if Chloie (cis) found out that her girlfriend was actually a boy wearing a magic hat: would she accept it, or would she feel betrayed? Why would you throw away that story, but for the ethnic narcissism of an "everyone is [our sexual minority]" dynamic?

And if you do want to go the ethnic narcissism route and make Chloie trans, why assert that Cat and Chloie are equally valid "even without the pills and surgery"? Isn't there a sense in which Cat's identity is more legitimate on account of the magic? How would Chloie (trans) react if she found out that her cis girlfriend was actually a boy wearing a magic hat? Would she die of jealousy? Would she bargain to try to borrow the hat—or even just steal it for herself?

(The conclusion to Issue 1 establishes that the hat's sex-change magic doesn't work on Knave's male friend, at which our hero(ine) infers that "it was meant for me." But is the power sponsoring the hat as kind to other (sufficiently) gender-dysphoric males? If so, I'll take back my claims about "identity" being meaningless: whether the hat works for you would be an experimental test demonstrating who is really trans.)

My favorite scene is probably the one where, after watching Fight Club at Cat's behest, Chloie admits that it wasn't bad, but is cynical about the educated middle-class bros of Project Mayhem thinking themselves oppressed by Society as if they were an actual persecuted minority. Cat is impressed: "you actually have stuff to say about [the film] too! You can be critical about it without trashing it. That's kinda rare". And maybe it is, kinda? But just—there's so much further you can go in that direction, than basic bitch social-justice criticism of basic bro movies. It's like putting "Microsoft Word skills" on your résumé (in the 200Xs, before everyone started using Google Docs). It's not that it's bad to know Word, but the choice to mention it says something about your conceptual horizons. Do you know how that looks?


Friendship Practices of the Secret-Sharing Plain Speech Valley Squirrels

In the days of auld lang syne on Earth-that-was, in the Valley of Plain Speech in the hinterlands beyond the Lake of Ambiguous Fortune, there lived a population of pre-intelligent squirrels. Historical mammologists have classified them into two main subspecies: the west-valley ground squirrels and the east-valley tree squirrels—numbers 9792 and 9794 in Umi's grand encyclopædia of Plain Speech creatures, but not necessarily respectively: I remember the numbers, but I can never remember which one is which.

Like many pre-intelligent creatures, both subspecies of Plain Speech Valley squirrels were highly social animals, with adaptations for entering stable repeated-cooperation relations with conspecifics: friendships being the technical term. Much of the squirrels' lives concerned the sharing of information about how to survive: how to fashion simple tools for digging up nuts, the best running patterns for fleeing predators, what kind of hole or tree offered the best shelter, &c. Possession of such information was valuable, and closely guarded: squirrels would only share secrets with their closest friends and family. Maneuvering to be told secrets, and occasionally to spread fake secrets to rivals, was the subject of much drama and intrigue in their lives.

At this, some novice students of historical mammology inquire: why be secretive? Surely if the squirrels were to pool their knowledge together, and build on each other's successes, they could accumulate ever-greater mastery over their environment, and possibly even spark their world's ascension?!

To which it is replied: evolution wouldn't necessarily select for that. Survival-relevant opportunities are often rivalrous: two squirrels can't both eat the same nut, or hide in the same one-squirrel-width hole. As it was put in a joke popular amongst the west-valley ground squirrels (according to Harrod's post-habilitation thesis on pre-intelligence in the days of auld lang syne): I don't need to outrun the predator, I just need to outrun my conspecifics. Thus, secrecy instincts turned out to be adaptive: a squirrel keeping a valuable secret to itself and its friends would gain more fitness than a squirrel who shared its knowledge freely with anysquirrel who could listen.

A few students inquire further: but that's a contingent fact about the distribution of squirrel-survival-relevant opportunities in the Valley of Plain Speech in the days of auld lang syne, right? A different distribution of adaptive problems might induce a less secretive psychology?

To which it is replied: yes, well, there's a reason the ascension of Earth-that-was would be sparked by the H. sapiens line of hominids some millions of years later, rather than by the Plain Speech subspecies 9792 and 9794.

Another adaptive information-processing instinct in subspecies 9792 and 9794 was a taste for novelty. Not all information is equally valuable. A slight variation on a known secret was less valuable than a completely original secret the likes of which had never been hitherto suspected. Among pre-intelligent creatures generally, novelty-seeking instincts are more convergent than secrecy instincts, but with considerable variation in strength depending on the partial-derivative matrix of the landscape of adaptive problems; Dripler's Pre-Intelligent Novelty-Seeking Scale puts subspecies 9792 and 9794 in the 76th percentile on this dimension.

The coincidental conjunction of a friendship-forming instinct, a novel-secret-seeking instinct, and a nearby distinct subspecies with similar properties, led to some unusual behavior patterns. Given the different survival-relevant opportunities in their respective habitats, each subspecies predominantly hoarded different secrets: the secret of how to jump and land on the thinner branches of the reedy pilot tree was of little relevance to the daily activity of a west-valley ground squirrel, but the secret of how to bury nuts without making it obvious that the ground had been upturned was of little import to an east-valley tree squirrel.

But the squirrels' novelty-seeking instincts didn't track such distinctions. Secrets from one subspecies thus functioned as a superstimulus to the other subspecies on account of being so exotic, thus making cross-subspecies friendships particularly desirable and sought-after—although not without difficulties.

Particular squirrels had a subspace of their behavior that characterized them as different from other individuals of the same age and sex: personality being the technical term (coined in Dunbar's volume on social systems). The friendship-forming instinct was most stimulated between squirrels with similar personalities, and the two subspecies had different personality distributions that resulted in frequent incompatibilities: for example, west-valley ground squirrels tended to have a more anxious disposition (reflecting the need to be alert to predators on open terrain), whereas east-valley tree squirrels tended to have a more rambunctious nature (as was useful for ritual leaf fights, but which tended to put west-valley ground squirrels on edge).

Really, the typical west-valley ground squirrel and the typical east-valley tree squirrel wouldn't have been friends at all, if not for the tantalizing allure of exotic secrets. Thus, cross-subspecies friendships tended to be successfully forged much less often than they were desired.

And so, many, many times in the days of auld lang syne, a squirrel in a burrow or a tree would sadly settle down to rest for the night, lamenting, "I wish I had a special friend. Someone who understood me. Someone to share my secrets with."

And beside them, a friend or a mate would attempt to comfort them, saying, "But I'm your friend. I understand you. You can share your secrets with me."

"That's not what I meant."


The Signaling Hazard Objection

A common far-right objection to tolerance of male homosexuality is that it constitutes a "signaling hazard": if Society legitimizes the gays rather than oppressing them, that interferes with normal men expressing friendly affection for each other without being seen as potentially gay, which is bad for the fabric of Society, which depends on strong bonds between men who trust each other. (Presumably, latent homosexual tendencies would still exist in some men even if forbidden, but gestures of affection between men wouldn't be seen as potentially escalating to homosexual relations, if homosexual relations were considered unthinkable and to be discouraged, with violence if necessary.)

People who grew up in the current year generally don't think much of this argument: why do you care if someone isn't sure you're straight? What's wrong with being gay?

The argument might be easier to understand if we can find other examples of "signaling hazard" dynamics. For example, well-read people in the current year are often aware of various facts that they're careful never to acknowledge in public for fear of being seen as right-wing (racist, sexist, homophobic, transphobic, &c.). In this context, the analogous dismissal, "Why do you care if someone isn't sure you're progressive? What's wrong with being right-wing?", doesn't seem compelling. Of course, we care; of course, there's something wrong with it.

One person's modus ponens is another's modus tollens; the implications of the analogy could be read in two ways. Maybe it's especially important that we repress right-wing ideologies, so that good progressive people can afford to speak more freely among ourselves without being confused for one of the bad guys.

Or maybe the libs got it right the first time, and it's possible to just—defy the signaling incentives? Why do you care what other people think?


The Two-Type Taxonomy Is a Useful Approximation for a More Detailed Causal Model

A lot of people tend to balk when first hearing about the two-type taxonomy of male-to-female transsexualism. What, one scoffs, you're saying all trans women are exactly one of these two things? It seems at once both too simple and too specific.

In some ways, it's a fair complaint! Psychology is complicated; every human is their own unique snowflake. But it would be impossible to navigate the world using the "every human is their own unique maximum-entropy snowflake" theory. In order to compress our observations of the world we see, we end up distilling our observations into categories, clusters, diagnoses, taxons: no one matches any particular clinical-profile stereotype exactly, but the world makes more sense when you have language for theoretical abstractions like "comas" or "depression" or "borderline personality disorder"—or "autogynephilia".

Concepts and theories are good to the extent that they can "pay for" their complexity by making more accurate predictions. How much more complexity is worth how much more accuracy? Arguably, it depends! General relativity has superseded Newtonian classical mechanics as the ultimate theory of how gravity works, but if you're not dealing with velocities approaching the speed of light, Newton still makes very good predictions: it's pretty reasonable to still talk about Newtonian gravitation being "true" if it makes the math easier on you, and the more complicated math doesn't give appreciably different answers to the problems you're interested in.

Moreover, if relativity hasn't been invented yet, it makes sense to stick with Newtonian gravity as the best theory you have so far, even if there are a few anomalies like the precession of Mercury that it struggles to explain.

The same general principles of reasoning apply to psychological theories, even though psychology is a much more difficult subject matter and our available theories are correspondingly much poorer and vaguer. There's no way to make precise quantitative predictions about a human's behavior the way we can about the movements of the planets, but we still know some things about humans, which get expressed as high-level generalities that nevertheless admit many exceptions: if you don't have the complicated true theory that would account for everything, then simple theories plus noise are better than pretending not to have a theory. As you learn more, you can try to pin down a more complicated theory that explains some of the anomalies that looked like "noise" to the simpler theory.

What does this look like for psychological theories? In the crudest form, when we notice a pattern of traits that tend to go together, we give it a name. Sometimes people go through cycles of elevated arousal and hyperactivity, punctuated by pits of depression. After seeing the same distinctive patterns in many such cases, doctors decided to reify it as a diagnosis, "bipolar disorder".

If we notice further patterns within the group of cases that make up a category, we can split it up into sub-categories: for example, a diagnosis of bipolar I requires a full-blown manic episode, but hypomania and a major depressive episode qualify one for bipolar II.

Is the two-type typology of bipolar disorder a good theory? Are bipolar I and bipolar II "really" different conditions, or slightly different presentations of "the same" condition, part of a "bipolar spectrum" along with cyclothymia? In our current state of knowledge, this is debatable, but if our understanding of the etiology of bipolar disorder were to advance, and we were to find evidence that bipolar I has a different underlying causal structure from bipolar II with decision-relevant consequences (like responding to different treatments), that would support a policy of thinking and talking about them as mostly separate things—even while they have enough in common to call them both kinds of "bipolar". The simple high-level category ("bipolar disorder") is a useful approximation in the absence of knowing the sub-category (bipolar I vs. II), and the sub-category is a useful approximation in the absence of knowing the patient's detailed case history.

With a sufficiently detailed causal story, you could even dispense with the high-level categories altogether and directly talk about the consequences of different neurotransmitter counts or whatever—but lacking that supreme precise knowledge, it's useful to sum over the details into high-level categories, and meaningful to debate whether a one-type or two-type taxonomy is a better statistical fit to the underlying reality whose full details we don't know.


In the case of male-to-female transsexualism, we notice a pattern where androphilic and non-androphilic trans women seem to be different from each other—not just in their sexuality, but also in their age of dysphoria onset, interests, and personality.

This claim is most famously associated with the work of Ray Blanchard, J. Michael Bailey, and Anne Lawrence, who argue that there are two discrete types of male-to-female transsexualism: an autogynephilic type (basically, men who love women and want to become what they love), and an androphilic/homosexual type (basically, the extreme right tail of feminine gay men).

But many authors have noticed the same bimodal clustering of traits under various names, while disagreeing about the underlying causality. Veale, Clarke, and Lomax attribute the differences to whether defense mechanisms are used to suppress a gender-variant identity. Anne Vitale identifies distinct groups (Group One and Group Three, in her terminology), but hypothesizes that the difference is due to degree of prenatal androgenization. Julia Serano concedes that "the correlations that Blanchard and other researchers prior to him described generally hold true", but denies their causal or taxonometric significance.

Is a two-type typology of male-to-female transsexualism a good theory? Is it "really" two different conditions (following Blanchard et al.), or slightly different presentations of "the same" condition (following e.g. Veale et al.)?

When the question is posed that way—if I have to choose between a one-type and a two-type theory—then I think the two-type theory is superior. But I also think we can do better and say more about the underlying causal structure that the simple two-types story is approximating, and hopefully explain anomalous cases that look like "noise" to the simple theory.

In the language of causal graphs (where the arrows point from cause to effect), here's what I think is going on:

[Figure: transition causal graph]

Let me explain.

What are the reasons a male-to-female transition might seem like a good idea to someone? Why would a male be interested in undergoing medical interventions to resemble a female and live socially as a woman? I see three prominent reasons, depicted as the parents of the "transition" node in a graph.

First and most obviously, femininity: if you happen to be a male with unusually female-typical psychological traits, you might fit into the social world better as a woman rather than as an anomalously effeminate man.

Second—second is hard to quickly explain if you're not already familiar with the phenomenon, but basically, autogynephilia is very obviously a real thing; I wrote about my experiences with it in a previous post. Crucially, autogynephilic identification with the idea of being female is distinct from naturally feminine behavior, which other people recognize when they see it.

Third—various cultural factors. You can't be trans if your culture doesn't have a concept of "being trans", and the concepts and incentives that your culture offers, make a difference as to how you turn out. Many people who think of themselves as trans women in today's culture, could very well be "the same" as people who thought of themselves as drag queens or occasional cross-dressers 10 or 20 or 30 years ago. (Either "the same" in terms of underlying dispositions, or, in many cases, just literally the same people.)

If there are multiple non-mutually-exclusive reasons why transitioning might seem like a good idea to someone, then the decision of whether to transition could take the form of a liability–threshold model: males transition if the sum of their levels of femininity, autogynephilia, and culture-related-trans-disposition exceed some threshold (given some sensible scheme for quantifying and adding (!) these traits).
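To make the threshold rule concrete, here is a minimal sketch in Python. The `Liabilities` class, the shared numeric scale, the example scores, and the threshold of 3.0 are all hypothetical choices for illustration; nothing here is a measured quantity, and simply adding the traits is exactly the questionable step flagged with "(!)" above.

```python
from dataclasses import dataclass


@dataclass
class Liabilities:
    """Hypothetical scores on a shared, made-up scale."""
    femininity: float
    autogynephilia: float
    cultural: float


def total_liability(x: Liabilities) -> float:
    # Assumption: the three traits can be quantified and simply added.
    return x.femininity + x.autogynephilia + x.cultural


def transitions(x: Liabilities, threshold: float = 3.0) -> bool:
    # Liability-threshold rule: the outcome happens when the sum exceeds the threshold.
    return total_liability(x) > threshold


# Two different routes over the same threshold, and one case that falls short.
mostly_feminine = Liabilities(femininity=3.5, autogynephilia=0.0, cultural=0.5)
mostly_autogynephilic = Liabilities(femininity=0.5, autogynephilia=3.0, cultural=0.5)
below_threshold = Liabilities(femininity=1.0, autogynephilia=1.0, cultural=0.5)

for person in (mostly_feminine, mostly_autogynephilic, below_threshold):
    print(person, transitions(person))
```

The point of the sketch is only that the same outcome can be reached by different mixtures of the inputs, which is what "non-mutually-exclusive reasons" amounts to.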

You might ask: okay, but then where do the two types come from? This graph is just illustrating (conjectured) cause-and-effect relationships, but if we were actually to flesh it out as a complete Bayesian network, there would be additional data that quantitatively specifies what (probability distribution over) values each node takes conditional on the values of its parents. When I claim that Blanchard–Bailey–Lawrence's two-type taxonomy is a useful approximation for this causal model, I'm claiming that the distribution represented by this Bayesian network (if we had the complete network) could also be approximated by a two-cluster model: most trans women high in the "femininity" factor will be low in the "autogynephilia" factor and vice versa, such that you can buy decent predictive accuracy by casually speaking as if there were two discrete "types".

Why? It has to do with the parents of femininity and autogynephilia in the graph. Suppose that gay men are more feminine than straight men, and autogynephilia is the result of being straight plus having an "erotic target location error", in which men who are attracted to something (in this case, women), are also attracted to the idea of being that thing.

Then the value of the sexual-orientation node is pushing the values of its children in opposite directions: gay males are more feminine and less autogynephilic, and straight males are less feminine and more autogynephilic, leading to two broadly different etiological trajectories by which transition might seem like a good idea to someone—even while it's not the case that the two types have nothing in common. For example, this model predicts that among autogynephilic males, those who transition are going to be selected for higher levels of femininity compared to those who don't transition—and in that aspect, their stories are going to have something in common with their androphilic sisters, even if the latter are broadly more feminine.
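As a toy illustration of how the graph could generate two clusters, here is a small simulation sketch. Every number in it (the 5% base rate, the means and spreads, the factor of 2, the threshold of 3.5) is an assumption invented for the example, as are the function names; the sketch only shows that a network with this shape can produce the qualitative pattern described, not that these are the right parameters.

```python
import random
from statistics import mean


def simulate(n=100_000, p_androphilic=0.05, threshold=3.5, seed=0):
    """Sample (androphilic, femininity, autogynephilia, transitioned) rows.

    All parameters are made-up illustrations, not estimates.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        androphilic = rng.random() < p_androphilic
        # sexual orientation -> femininity
        femininity = rng.gauss(1.5 if androphilic else 0.0, 1.0)
        # sexual orientation + erotic target location error -> autogynephilia
        etle = max(0.0, rng.gauss(0.0, 1.0))
        autogynephilia = 0.0 if androphilic else 2.0 * etle
        # cultural factors, independent of the other nodes in this sketch
        cultural = rng.gauss(0.0, 0.5)
        # liability-threshold rule for the "transition" node
        transitioned = femininity + autogynephilia + cultural > threshold
        rows.append((androphilic, femininity, autogynephilia, transitioned))
    return rows


def group_mean(rows, column, androphilic, transitioned):
    # Average one column over the rows matching the given group.
    return mean(r[column] for r in rows
                if r[0] == androphilic and r[3] == transitioned)


FEMININITY, AUTOGYNEPHILIA = 1, 2
rows = simulate()
andro_fem = group_mean(rows, FEMININITY, True, True)
other_fem = group_mean(rows, FEMININITY, False, True)
andro_agp = group_mean(rows, AUTOGYNEPHILIA, True, True)
other_agp = group_mean(rows, AUTOGYNEPHILIA, False, True)
other_fem_no_transition = group_mean(rows, FEMININITY, False, False)

print("femininity among transitioners:", andro_fem, other_fem)
print("autogynephilia among transitioners:", andro_agp, other_agp)
print("non-androphilic femininity, no transition:", other_fem_no_transition)
```

Under these assumptions, the two groups of transitioners separate on both traits, and the non-androphilic transitioners still come out more feminine on average than non-androphilic non-transitioners, which is the selection effect mentioned above.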

(Of course, it's also the case that the component factors in a liability-threshold model would negatively correlate among the population past a threshold, due to the effect of conditioning on a collider, as in the famous Berkson's paradox. But I'm claiming that the degree of bimodality induced by the effects of sexual orientation is substantially greater than that accounted for by the conditioning-on-a-collider effect.)
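The conditioning-on-a-collider effect is easy to see in a sketch with no sexual-orientation node at all. Here two factors are drawn independently (standard normal, an arbitrary choice), and the only thing that happens is selection on their sum exceeding a threshold of 2.0 (also arbitrary). The helper `pearson` is written out by hand so the block depends only on the standard library.

```python
import random
from math import sqrt


def pearson(xs, ys):
    # Plain Pearson correlation coefficient.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


rng = random.Random(0)
# Two independent factors: uncorrelated in the whole population by construction.
pairs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(50_000)]
# Keep only the cases whose summed liability exceeds the threshold.
selected = [(a, b) for a, b in pairs if a + b > 2.0]

r_all = pearson(*zip(*pairs))
r_selected = pearson(*zip(*selected))
print("correlation in the whole population:", r_all)
print("correlation among those past the threshold:", r_selected)
```

In the full population the correlation is near zero, while among the selected cases it is clearly negative: someone who got past the threshold with a low score on one factor must have had a high score on the other. The claim in the text is that the orientation effect adds bimodality beyond this baseline.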

An advantage of this kind of probabilistic model is that it gives us a causal account of the broad trends we see, while also not being too "brittle" in the face of a complex world. The threshold graphical model explains why the two-type taxonomy looks so compelling as a first approximation, without immediately collapsing the moment we meet a relatively unusual individual who doesn't seem to quite fit the strictest interpretation of the classical two-type taxonomy. For example, when we meet a trans woman who's not very feminine and has no history of autogynephilia, we can predict that in her case, there were probably unusually intense cultural factors (e.g., internalized misandry) making transition seem like a salient option (and therefore that her analogue in previous generations wouldn't have been transsexual), instead of predicting that she doesn't exist. (It's possible that what Blanchard–Bailey–Lawrence conceived of as an androphilic vs. autogynephilic taxonomy, may be better thought of as an androphilic vs. not-otherwise-specified taxonomy, if feminine androphiles form a distinct cluster, but it's not easy to disambiguate autogynephilia from all other possible reasons for not-overtly-feminine males to show up at the gender clinic.)

Care must be taken to avoid abusing the probabilistic nature of the model to make excuses to avoid falsification. The theory that can explain everything with equal probability, explains nothing: if you find yourself saying, "Oh, this case is an exception" too often, you do need to revise your theory. But a "small" number of "exceptions" can actually be fine: a theory that says a coin is biased to come up Heads 80% of the time, isn't falsified by a single Tails (and is in fact confirmed if that Tails happens 20% of the time).
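The coin example can be worked through with a two-hypothesis Bayesian update. As an assumption for illustration, the rival to the "80% Heads" theory is taken to be a fair coin, with even prior odds; the function names are hypothetical.

```python
from math import exp, log


def log_likelihood(p_heads, heads, tails):
    # Log-probability of one particular sequence with these counts.
    return heads * log(p_heads) + tails * log(1.0 - p_heads)


def posterior_biased(heads, tails, prior=0.5, p_biased=0.8, p_fair=0.5):
    # Posterior probability of the "80% Heads" theory against a fair-coin rival.
    log_biased = log(prior) + log_likelihood(p_biased, heads, tails)
    log_fair = log(1.0 - prior) + log_likelihood(p_fair, heads, tails)
    return 1.0 / (1.0 + exp(log_fair - log_biased))


print("after a single Tails:", posterior_biased(heads=0, tails=1))
print("after 80 Heads, 20 Tails:", posterior_biased(heads=80, tails=20))
print("after 50 Heads, 50 Tails:", posterior_biased(heads=50, tails=50))
```

A single Tails lowers the theory's probability without zeroing it out, Tails at the predicted 20% rate ends up strongly favoring it, and Tails half the time sinks it. That last case is the quantitative version of saying "this case is an exception" too often.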

At this point, you might ask: okay, but why do I believe this? Anyone can name some variables and sketch a directed graph between them. Why should you believe this particular graph is true?

Ultimately, the reader cannot abdicate responsibility to think it through and decide for herself ... but it seems to me that all six arrows in the graph are things that we separately have a pretty large weight of evidence for, either in published scientific studies, or just informally looking at the world.

The femininity→transition arrow is obvious. The sexual orientation→femininity arrow (representing the fact that gay men are more feminine than straight men), besides being stereotypical folk knowledge, has also been extensively documented, for example by Lippa and by Bailey and Zucker. Evidence for the "v-structure" between sexual orientation, erotic target location erroneousness, and autogynephilia has been documented by Anne Lawrence: furries and amputee-wannabes who want to emulate the objects of their attraction, "look like" "the same thing" as autogynephiles, but pointed at a less conventional erotic target than women. The autogynephilia–transition concordance has been documented by many authors, and I claim the direction of causality is obvious. (If you want to argue that it goes the other way—that some underlying "gender identity" causes both autogynephilia and, separately, the desire to transition, then why does it usually not work that way for androphiles?) The cultural-factors→transition arrow is obvious if you haven't been living under a rock for the last decade.

This has been a qualitative summary of my current thinking. I'm very bullish on thinking in graphical models rather than discrete taxons being the way to go, but it would be a lot more work to pin down all these claims more rigorously—or, to the extent that my graph is wrong, to figure out the correct (or, a more correct, less wrong) graph instead.

(Thanks to the immortal Tailcalled for discussion.)


An Egoist Faith

(Previously: "a laziness born out of resignation and despair, a sense that I've outlived myself, that my story and my world is over, and I'm just enjoying a reasonably comfortable afterlife in the time we have left ...")

People mostly don't do things. They really don't. In order to defy fate and do a thing, you need to Believe in what you're doing, because if you don't Believe, then your motivational system will direct your time and attention to something, anything else that it can Believe in more, like Super Auto Pets.

Thus, it's not possible for a writer to think something like, "I just want to be done with this stupid memoir of religious betrayal that no one should care about, in order to get the Whole Dumb Story out of my system so that I can be over it and move on with my afterlife and maybe work on something that matters instead." (Though someone who self-identifies as a writer can think that.) You can't write in order to be done. It might be possible to produce text under that motivation—though I don't think I've seen it happen myself—but that would only be language-model output, not writing.

If all you really wanted was to be done, you could just—decide to be done, without writing. Just walk away, and let everything left unsaid, remain unsaid. If that doesn't seem satisfactory, it's probably because of some deep, uncancellable conviction that the memoir is not stupid, that the religious leaders did betray you and their faith, that someone should care, that telling the Whole Dumb Story—telling it right, so that every graf sings and hits the exact notes of righteous fury and deconfusion and penetrating portraiture—is part of your life, and not a prerequisite to indulging the part that comes after.

Even if you have to grant, without hesitating, that there is an obvious sense in which these issues are not "important" in the grand scheme of things, that doesn't give you the obligation or even the option to work on something that matters instead. You could produce text that you identify as being "on" something that matters, but that's not work—it's predictably not going to be work that matters on something that matters, which can only be fueled by a power born of having Something to Protect. You can't realistically do work that matters out of resignation, during a reasonably comfortable afterlife after having been taken off the game board that really mattered to you, however "unimportant" it is to ulteriority or the Powers that be.

The only way out is through. If I am going to pivot to work on important things, it's going to be after I've stopped thinking that this is already my afterlife. Only after I've told my Story—not to get it over with, but because I Believe that it matters.