check in

diff --git a/content/drafts/standing-under-the-same-sky.md b/content/drafts/standing-under-the-same-sky.md

index e82d186..c0c5a16 100644
--- a/content/drafts/standing-under-the-same-sky.md
+++ b/content/drafts/standing-under-the-same-sky.md
@@ -550,7 +550,7 @@ Is that ... _not_ evidence of harm to the community? If that's not community-har
  
  On 1 April 2022, Yudkowsky published ["MIRI Announces New 'Death With Dignity' Strategy"](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy), a cry of despair in the guise of an April Fool's Day post. MIRI didn't know how to align a superintelligence, no one else did either, but AI capabilities work was continuing apace. With no credible plan to avert almost-certain doom, the most we could do now was to strive to give the human race a more dignified death, as measured in log-odds of survival: an alignment effort that doubled the probability of a valuable future from 0.0001 to 0.0002 was worth one information-theoretic bit of dignity.
  
-In a way, "Death With Dignity" isn't really an update. Yudkowsky had always refused to give a probability of success, while maintaining that Friendly AI was ["impossible"](https://www.lesswrong.com/posts/nCvvhFBaayaXyuBiD/shut-up-and-do-the-impossible). Now, he says the probability is approximately zero.
+In a way, "Death With Dignity" isn't really an update. Yudkowsky had always refused to name a "win" probability, while maintaining that Friendly AI was ["impossible"](https://www.lesswrong.com/posts/nCvvhFBaayaXyuBiD/shut-up-and-do-the-impossible). Now, he says the probability is approximately zero.
  
  Paul Christiano, who has a much more optimistic picture of humanity's chances, nevertheless said that he liked the "dignity" heuristic. I like it, too. It—takes some of the pressure off. I [made an analogy](https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy?commentId=R59aLxyj3rvjBLbHg): your plane crashed in the ocean. To survive, you must swim to shore. You know that the shore is west, but you don't know how far. The optimist thinks the shore is just over the horizon; we only need to swim a few miles and we'll probably make it. The pessimist thinks the shore is a thousand miles away and we will surely die. But the optimist and pessimist can both agree on how far we've swum up to this point, and that the most dignified course of action is "Swim west as far as you can."
  
@@ -667,7 +667,7 @@ On the other hand—given that he was paying attention to this #overflow thread
  
  The other chatroom participants mostly weren't buying what I was selling.
  
-A user called April wrote that "the standard dath ilani has internalized almost everything in the sequences": "it's not that the standards are being dropped[;] it's that there's an even higher standard far beyond what anyone on earth has accomplished". (This received a checkmark emoji-react from Yudkowsky, an indication of his agreement.)
+A user called April wrote that "the standard dath ilani has internalized almost everything in the sequences": "it's not that the standards are being dropped[;] it's that there's an even higher standard far beyond what anyone on earth has accomplished". (This received a checkmark emoji-react from Yudkowsky, an indication of his agreement/endorsement.)
  
  Someone else said he was "pretty leery of 'ignore whether models are painful' as a principle, for Earth humans to try to adopt," and went on to offer some thoughts for Earth. I continued to maintain that it was ridiculous that we were talking of "Earth humans" as if there were any other kind—as if rationality in the Yudkowskian tradition wasn't something to aspire to in real life.
  
@@ -793,9 +793,81 @@ I was pleased to get the link to Habryka's comment in front of Yudkowsky, if he
  
  It turned out that I was lying about probably not talking in the server anymore. (Hedging the word "probably" didn't make the claim true, and of course I wasn't _consciously_ lying, but that hardly seems exculpatory.)
  
-The thread went on.
+The next day, I belatedly pointed out that "Keltham thought that not learning about masochists he can never have, was obviously in retrospect what he'd have wanted Civilization to do" seemed to contradict "one thing hasn't changed: the message that you, yourself, should always be trying to infer the true truth". In the first statement, it didn't sound like Keltham thinks it's good that Civilization didn't tell him so that he could figure it out for himself (in accordance with the discipline of "you, yourself, always trying to infer the truth"). It sounded like he was better off not knowing—better off having a _less accurate self-model_ (not having the concept of "obligate romantic sadism"), better off having a _less accurate world-model_ (thinking that masochism isn't real).
  
-[TODO: regrets and wasted time
+In response to someone positing that dath ilani were choosing to be happier but less accurate predictors, I said that I read a blog post once about why you actually didn't want to do that, linking to [an Internet Archive copy of "Doublethink (Choosing to Be Biased)"](https://web.archive.org/web/20080216204229/https://www.overcomingbias.com/2007/09/doublethink-cho.html) from 2008[^hanson-conceit]—at least, that was _my_ attempted paraphrase; it was possible that I'd extracted a simpler message from it than the author intended.
+
+[^hanson-conceit]: I was really enjoying the "Robin Hanson's blog in 2008" conceit.
+
+A user called Harmless explained the loophole. "Doublethink" was pointing out that decisions that optimize the world for your preferences can't come from nowhere: if you avoid painful thoughts in your map, you damage your ability to steer away from painful outcomes in the territory. However, there was no rule that all the information-processing going into decisions that optimize the world for your preferences had to take place in _your brain_ ...
+
+I saw where they were going and completed the thought: you could build a Friendly AI or a Civilization to see all the dirty things for you, that would make you unhappy to have to see yourself.
+
+Yudkowsky clarified his position:
+
+> My exact word choices often do matter: I said that you should always be trying to infer the truth. With the info you already have. In dath ilan if not in Earth, you might decline to open a box labeled "this info will make you permanently dissatisfied with sex" if the box was labeled by a prediction market.  
+> Trying to avoid inferences seems to me much more internally costly than declining to click on a spoiler box.  
+
+I understood the theory, but I was still extremely skeptical of the practice, assuming the eliezera were even remotely human. Yudkowsky described the practice of "keeping BDSM secret and trying to prevent most sadists from discovering what they are—informing them only when and if they become rich enough or famous enough that they'd have a high probability of successfully obtaining a very rare masochist" as a "basically reasonable policy option that [he] might vote for, not to help the poor dear other people, but to help [his] own counterfactual self."
+
+The problem I saw with this is that becoming rich and famous isn't a purely random exogenous event. In order to make an informed decision about whether or not to put in the effort to try to _become_ rich and famous (as contrasted to choosing a lower-risk or more laid-back lifestyle), you need accurate beliefs about the perks of being rich and famous.
+
+The dilemma of whether to make more ambitious economic choices in pursuit of sexual goals was something that _already_ happens to people on Earth, rather than being hypothetical. I once met a trans woman who spent a lot of her twenties and thirties working very hard to get money for various medical procedures. I think she would be worse off under a censorship regime run by self-styled Keepers who thought it was kinder to prevent _poor people_ from learning about the concept of "transsexualism".
+
+Further discussion established that Yudkowsky was (supposedly) already taking into account the distortion on individuals' decisions, but that the empirical setting of probabilities and utilities happened to be such that ignorance came out on top.
+
+I wasn't sure what my wordcount and diplomacy budget limits for the server were, but I couldn't let go; I kept the thread going on subsequent days. There was something I felt I should be able to convey, if I could just find the right words.
+
+When Word of God says, "trying to prevent most [_X_] from discovering what they are [...] continues to strike me as a basically reasonable policy option", then, separately from the particular value of _X_, I expected people to jump out of their chairs and say, "No! This is wrong! Morally wrong! People can stand what is true about themselves, because they are already doing so!"
+
+And to the extent that I was the only person jumping out of my chair, and there was a party-line response of the form, "Ah, but if it's been decreed by authorial fiat that these-and-such probabilities and utilities take such-and-these values, then in this case, self-knowledge is actually bad under the utilitarian calculus," I wasn't disputing the utilitarian calculus. I was wondering—here I used the "bug" emoji customarily used on Glowfic and adjacent servers to indicate uncertainty about the right words to use—_who destroyed your souls?_
+
+Yudkowsky replied:
+
+> it feels powerfully relevant to me that the people of whom I am saying this are eliezera. I get to decide what they'd want because, unlike with Earth humans, I get to put myself in their shoes. it's plausible to me that the prediction markets say that I'd be sadder if I was exposed to the concept of sadism in a world with no masochists. if so, while I wouldn't relinquish my Art and lose my powers by trying to delude myself about that once I'd been told, I'd consider it a friendly act to keep the info from me—because I have less self-delusional defenses than a standard Earthling, really—and a hostile act to tell me; and if you are telling me I don't get to make that decision for myself because it's evil, and if you go around shouting it from the street corners in dath ilan, then yeah I think most cities don't let you in.
+
+I wish I had thought to ask if he'd have felt the same way in 2008.
+
+Ajvermillion was still baffled at my skepticism: if the author specifies that the world of the story is simple in this-and-such direction, on what grounds could I _disagree_?
+
+I admitted, again, that there was a sense in which I couldn't argue with authorial fiat. But I thought that an author's choice of assumptions reveals something about what they think is true in our world, and commenting on that should be fair game for literary critics. Suppose someone wrote a story and said, "in the world portrayed in this story, everyone is super-great at _kung fu_, and they could beat up everyone from our Earth, but they never have to practice at all."
+
+(Yudkowsky retorted, "...you realize you're describing like half the alien planets in comic books? when did Superman ever get depicted as studying kung fu?" I wish I had thought to admit that, yes, I _did_ hold Eliezer Yudkowsky to a higher standard of consilient worldbuilding than DC Comics. Would he rather I _didn't_?)
+
+Something about innate _kung fu_ world seems fake in a way that seems like a literary flaw. It's not just about plausibility. Innate _kung fu_ skills are scientifically plausible[^instinct] in a way that faster-than-light travel is not. Fiction incorporates unrealistic elements in order to tell a story that has relevance to real human lives. Throwing faster-than-light travel into the universe so that you can do a space opera doesn't make the _people_ fake in the way that Superman's fighting skills are fake.
+
+[^instinct]: All sorts of other instinctual behaviors exist in animals; I don't see why skills humans have to study for years as a "martial art" couldn't be coded into the genome.
+
+Similarly, a world that's claimed by authorial fiat to be super-great at epistemic rationality, but where the people don't have a will-to-truth stronger than their will-to-happiness, felt fake to me. I couldn't _prove_ that it was fake. I agreed with Harmless's case that, _technically_, as far as the Law went, you could build a Civilization or a Friendly AI to see all the ugly things that you preferred not to see.
+
+But if you could—would you? And more importantly, if you would—could you?
+
+It was possible that the attitude I was evincing here was just a difference between the eliezera out of dath ilan and the Zackistani from my medianworld, and that there's nothing more to be said about it. But I didn't think the thing was a _genetic_ trait of the Zackistani! _I_ got it from spending my early twenties obsessively re-reading blog posts that said things like, ["I believe that it is right and proper for me, as a human being, to have an interest in the future [...] One of those interests is the human pursuit of truth [...] I wish to strengthen that pursuit further, in this generation."](https://www.lesswrong.com/posts/anCubLdggTWjnEvBS/your-rationality-is-my-business)
+
+There were definitely communities on Earth where I wasn't allowed in because of my tendency to shout things from street corners, and I respected those people's right to have a safe space for themselves.
+
+But those communities ... didn't call themselves _rationalists_, weren't _pretending_ to be inheritors of the great tradition of E. T. Jaynes and Robin Dawes and Richard Feynman. And if they _did_, I think I would have a false advertising complaint against them.
+
+"The eleventh virtue is scholarship. Study many sciences and absorb their power as your own ... unless a prediction market says that would make you less happy," just didn't have the same ring to it. Neither did "The first virtue is curiosity. A burning itch to know is higher than a solemn vow to pursue truth. But higher than both of those, is trusting your Society's institutions to tell you which kinds of knowledge will make you happy"—even if you stipulated by authorial fiat that your Society's institutions are super-competent, such that they're probably right about the happiness thing.
+
+[TODO: Atlas Shrugged quote and children's morals]
+
+[TODO: Yudkowsky tests me]
+
+[TODO: derail with Lintamande]
+
+[TODO: knives, and showing myself out]
+
+------
+
+Anyway, that—briefly (I mean it)—is the Whole Dumb Story about how I wasted the last seven years of my life. It's probably not that interesting? Life goes on—for now. My dayjob contract expired at the end of 2022. In 2023, I've been finishing up this memoir, and posting some other ideas to _Less Wrong_. (I got into another slapfight about me being un-collaborative, which is not interesting enough to summarize.)
+
+After this, the AI situation is looking worrying enough, that I'm thinking I should try to do some more direct xrisk-reduction work, although I haven't definitely selected any particular job or project. (It probably won't matter, but it will be dignified.) Now that the shape of the threat is on the horizon, I think I'm less afraid of being directly involved. Something about having large language models to study in the 'twenties is—grounding, compared to the superstitious fears of the paperclip boogeyman of my nightmares in the 'teens.
+
+Like all intellectuals, as a teenager I imagined that I would write a book. It was always going to be about gender, but I was vaguely imagining a novel, which never got beyond vague imaginings. That was before the Sequences. I'm 35 years old now. I think my intellectual life has succeeded in ways I didn't know how to imagine, before. I think my past self would be proud of this blog—140,000 words of blog posts stapled together is _morally_ a book—once he got over the shock of heresy.
+
+[TODO conclusion, cont'd—
   * Do I have regrets about this Whole Dumb Story? A lot, surely—it's been a lot of wasted time. But it's also hard to say what I should have done differently; I could have listened to Ben more and lost faith in Yudkowsky earlier, but he had earned a lot of benefit of the doubt?
+ * even young smart AGPs who can appreciate my work have still gotten pinkpilled
   * less drama (in my youth, I would have been proud that at least this vice was a feminine trait; now, I prefer to be good even if that means being a good man)
  ]