check in

[Ultimately_Untrue_Thought.git] / content / drafts / standing-under-the-same-sky.md
diff --git a/content/drafts/standing-under-the-same-sky.md b/content/drafts/standing-under-the-same-sky.md

index 2da316c..4ada27e 100644 (file)
--- a/content/drafts/standing-under-the-same-sky.md
+++ b/content/drafts/standing-under-the-same-sky.md
@@ -47,7 +47,7 @@ But pushing on embryo selection only makes sense as an intervention for optimizi
  
  But if you think the only hope for there _being_ a future flows through maintaining influence over what large tech companies are doing as they build transformative AI, declining to contradict the state religion makes more sense—if you don't have _time_ to win a culture war, because you need to grab hold of the Singularity (or perform a [pivotal act](https://arbital.com/p/pivotal/) to prevent it) _now_. If the progressive machine marks you as a transphobic bigot, the machine's functionaries at OpenAI or Meta AI Research are less likely to listen to you when you explain why [their safety plan](https://openai.com/blog/our-approach-to-alignment-research/) won't work, or why they should have a safety plan at all.
  
-(I remarked to "Wilhelm" in June 2022 that DeepMind [changing its Twitter avatar to a rainbow variant of their logo for Pride month](https://web.archive.org/web/20220607123748/https://twitter.com/DeepMind) was a bad sign.)
+(I remarked to "Wilhelm" in mid-2022 that DeepMind [changing its Twitter avatar to a rainbow variant of their logo for Pride month](https://web.archive.org/web/20220607123748/https://twitter.com/DeepMind) was a bad sign.)
  
  So isn't there a story here where I'm the villain, willfully damaging humanity's chances of survival by picking unimportant culture-war fights in the xrisk-reduction social sphere, when _I know_ that the sphere needs to keep its nose clean in the eyes of the progressive egregore? _That's_ why Yudkowsky said the arguably-technically-misleading things he said about my Something to Protect: he _had_ to, to keep our collective nose clean. The people paying attention to contemporary politics don't know what I know, and can't usefully be told. Isn't it better for humanity if my meager talents are allocated to making AI go well? Don't I have a responsibility to fall in line and take one for the team—if the world is at stake?
  
@@ -99,20 +99,71 @@ So, naïvely, doesn't Yudkowsky's "personally prudent to post your agreement wit
  
  I can think of two reasons why the naïve objection might fail. (And who can say but that a neutral expert witness on decision theory wouldn't think of more?)
  
-First, the true decision theory is subtler than "defy anything that you can commonsensically pattern-match as looking like 'extortion'"; the case for resisting extortion specifically rests on there existing a subjunctive dependence between your decision and the extortionist's decision (they threaten _because_ you'll give in, or won't bother _because_ you won't), and the relevant subjunctive dependence doesn't obviously pertain in the real-life science intellectual _vs._ social justice mob match-up. If the mob has been trained from past experience to predict that their targets will give in, should you defy them now in order to somehow make your current situation "less real"? Depending on the correct theory of logical counterfactuals, the right stance might be ["We don't negotiate with terrorists, but we do appease bears"](/2019/Dec/political-science-epigrams/) (because the bear's response isn't calculated based on our response), and the forces of political orthodoxy might be relevantly bear-like.
+First, the true decision theory is subtler than "defy anything that you can commonsensically pattern-match as looking like 'extortion'"; the case for resisting extortion specifically rests on there existing a subjunctive dependence between your decision and the extortionist's decision: they threaten _because_ you'll give in, or don't bother _because_ you won't.
  
-On the other hand, the relevant subjunctive dependence doesn't obviously _not_ pertain, either! Parsing social justice as an agentic "threat" rather than a non-agentic obstacle like an avalanche, does seem to line up with the fact that people punish heretics, who dissent from an ideological group, more than infidels, who were never part of the group to begin with—_because_ heretics are more extortable—more vulnerable to social punishment from the original group.
+Okay, but then how do I compute this "subjunctive dependence" thing? Presumably it has something to do with the extortionist's decisionmaking process incuding a model of the target. How good does that model have to be for it to "count"?
  
-Which brings me to the second reason the naïve anti-extortion argument might fail: what counts as "extortion" depends on the relevant "property rights", what the "default" action is. If having free speech is the default, being excluded from the coalition for defying the orthodoxy could be construed as extortion. But if _being excluded from the coalition_ is the default, maybe toeing the line of orthodoxy is the price you need to pay in order to be included.
+I don't know—and if I don't know, I can't say that the relevant subjunctive dependence obviously pertains in the real-life science intellectual _vs._ social justice mob match-up. If the mob has been trained from past experience to predict that their targets will give in, should you defy them now in order to somehow make your current predicament "less real"? Depending on the correct theory of logical counterfactuals, the correct stance might be "We don't negotiate with terrorists, but [we do appease bears](/2019/Dec/political-science-epigrams/) and avoid avalanches" (because neither the bear's nor the avalanche's behavior is calculated based on our response), and the forces of political orthodoxy might be relevantly bear- or avalanche-like.
  
-[TODO: defying threats, cont'd—
+On the other hand, the relevant subjunctive dependence doesn't obviously _not_ pertain, either! Yudkowsky does seem to endorse commonsense pattern-matching to "extortion" in contexts like nuclear diplomacy. Or I remember back in 'aught-nine, Tyler Emerson was caught embezzling funds from the Singularity Institute, and SingInst made it a point of pride to prosecute on decision-theoretic grounds, when a lot of other nonprofits would have quietly and causal-decision-theoretically covered it up to spare themselves the embarrassment. Parsing social justice as an agentic "threat" rather than a non-agentic obstacle like an avalanche, does seem to line up with the fact that people punish heretics (who dissent from an ideological group) more than infidels (who were never part of the group to begin with), _because_ heretics are more extortable—more vulnerable to social punishment from the original group.
  
- * Yudkowsky does seemingly back commonsensical interpretations, re voting, or how, back in 'aught-nine, SingInst had made a point of prosecuting Tyler Emerson, citing decision theory
+Which brings me to the second reason the naïve anti-extortion argument might fail: [what counts as "extortion" depends on the relevant "property rights", what the "default" action is](https://www.lesswrong.com/posts/Qjaaux3XnLBwomuNK/countess-and-baron-attempt-to-define-blackmail-fail). If having free speech is the default, being excluded from the dominant coalition for defying the orthodoxy could be construed as extortion. But if _being excluded from the coalition_ is the default, maybe toeing the line of orthodoxy is the price you need to pay in order to be included.
  
- * Curtis Yarvin has compared Yudkowsky to Sabbatai Zevi (/2020/Aug/yarvin-on-less-wrong/), and I've got to say the comparison is dead-on. Sabbatai Zevi was facing much harsher coercion: his choices were to convert to Islam or be impaled https://en.wikipedia.org/wiki/Sabbatai_Zevi#Conversion_to_Islam
+Yudkowsky has [a proposal for how bargaining should work between agents with different notions of "fairness"](https://www.lesswrong.com/posts/z2YwmzuT7nWx62Kfh/cooperating-with-agents-with-different-ideas-of-fairness).
+
+Suppose Edgar and Fiona are splitting a pie, and if they can't initially agree on how to split it, they have to fight over it until they do, destroying some of the pie in the process. Edgar thinks the fair outcome is that they each get half the pie. Fiona claims that she contributed more ingredients to the baking process and that it's therefore fair that she gets 75% of the pie, pledging to fight if offered anything less.
+
+If Edgar were a causal decision theorist, he might agree to the 75/25 split, reasoning that 25% of the pie is better than fighting until the pie is destroyed. Yudkowsky argues that this is irrational: if Edgar is willing to agree to a 75/25 split, then Fiona has no incentive not to adopt such a self-favoring definition of "fairness". (And _vice versa_ if Fiona's concept of fairness is the "correct" one.)
+
+Instead, Yudkowsky argues, Edgar should behave so as to only do worse than the fair outcome if Fiona _also_ does worse: for example, by accepting a 48/32 split (after 100−(32+48) = 20% of the pie has been destroyed by the costs of fighting) or an 42/18 split (where 40% of the pie has been destroyed). This isn't Pareto-optimal (it would be possible for both Edgar and Fiona to get more pie by reaching an agreement with less fighting), but it's worth it to Edgar to burn some of Fiona's utility fighting in order to resist being exploited by her, and at least it's better than the equilibrium where the pie gets destroyed (which is Nash because neither party can unilaterally stop fighting).
+
+It seemed to me that in the contest over the pie of Society's shared map, the rationalist Caliphate was letting itself get exploited by the progressive Egregore, doing worse than the fair outcome without dealing any damage to the egregore in return. Why?
+
+The logic of "dump stats", presumably. Bargaining to get AI risk on the shared map—not even to get it taken seriously as we would count "taking it seriously", but just acknowledged at all—was hard enough. Trying to challenge the Egregore about an item that it actually cared about would trigger more fighting than we could afford.
+
+I told the illustration about splitting a pie as a symmetrical story: if Edgar and Fiona destroy the pie fighting, than neither of them get any pie. But in more complicated scenarios (including the real world), there was no guarantee that non-Pareto Nash equilibria were equally bad for everyone.
+
+I'd had a Twitter exchange with Yudkowsky in January 2020 that revealed some of his current-year thinking about Nash equilibria. I [had Tweeted](https://twitter.com/zackmdavis/status/1206718983115698176):
+
+> 1940s war criminal defense: "I was only following orders!"  
+> 2020s war criminal defense: "I was only participating in a bad Nash equilibrium that no single actor can defy unilaterally!"
+
+(The language of the latter being [a reference to Yudkowsky's _Inadequate Equilibria_](https://equilibriabook.com/molochs-toolbox/).)
+
+Yudkowsky quote-Tweet dunked on me:
+
+> [TODO: well, YES]
+
+I pointed out the voting case as one where he seemed to be disagreeing with his past self, linking to 2008's "Stop Voting for Nincompoops". What changed his mind?
+
+"Improved model of the social climate where revolutions are much less startable or controllable by good actors," he said. "Having spent more time chewing on Nash equilibria, and realizing that the trap is _real_ and can't be defied away even if it's very unpleasant."
+
+In response to Sarah Constantin mentioning that there was no personal cost to voting third-party, Yudkowsky pointed out that the problem was the third-party spoiler effect, not personal cost: "People who refused to vote for Hillary didn't pay the price, kids in cages did, but that still makes the action nonbest."
+
+[TODO: look up the extent to which "kids in cages" were also a thing during the Obama and Biden administrations]
+
+I asked what was wrong with the disjunction from "Stop Voting for Nincompoops", where the earlier Yudkowsky had written that it's hard to see who should accept the argument to vote for the lesser of two evils, but refuse to accept the argument against voting because it won't make a difference. Unilaterally voting for Clinton doesn't save the kids!
+
+"Vote when you're part of a decision-theoretic logical cohort large enough to change things, or when you're worried about your reputation and want to be honest about whether you voted," Yudkowsky replied.
+
+"How do I compute whether I'm in a large enough decision-theoretic cohort?" I asked. Did we know that, or was that still on the open problems list?
+
+Yudkowsky said that he traded his vote for a Clinton swing state vote, partially hoping that that would scale [...]
+
+
+
+ * So maybe he doesn't think he's part of a decision-theoretic logical cohort large enough to resist the egregore, and he's also not worried about his reputation for resisting the egregore
+ * If his reptuation in the eyes of people like me just isn't that valuable, I guess I can't argue with that
+
+Curtis Yarvin [likes to compare](/2020/Aug/yarvin-on-less-wrong/) Yudkowsky to [Sabbatai Zevi](https://en.wikipedia.org/wiki/Sabbatai_Zevi#Conversion_to_Islam), the Jewish religious leader who was purported to be the Messiah, who converted to Islam under coercion from the Ottomans. "I know, without a shadow of a doubt, that in the same position, Eliezer Yudkowsky would also convert to Islam," said Yarvin.
+
+ * But this isn't necessarily crazy. Zevi was facing some very harsh coercion: convert or be impaled.
+ * My real question is, in the same position, would Sabbatai Zevi declare that 30% of the ones with penises are actually women?
  
  ]
  
+-----
+
  I like to imagine that they have a saying out of dath ilan: once is happenstance; twice is coincidence; _three times is hostile optimization_.
  
  I could forgive him for taking a shit on d4 of my chessboard (["at least 20% of the ones with penises are actually women"](https://www.facebook.com/yudkowsky/posts/10154078468809228)).
@@ -530,7 +581,7 @@ In [the story about how Merrin came to the attention of dath ilan's bureau of Ex
  
  Notwithstanding that Rittaen can be Watsonianly assumed to have detailed neuroscience skills that the author Doylistically doesn't know how to write, I am entirely unimpressed by the assertion that this idea is somehow _dangerous_, a secret that only Keepers can bear, rather than something _Merrin herself should be clued into_. "It's not [Rittaen's] place to meddle just because he knows Merrin better than Merrin does," we're told.
  
-In the same story, an agent from Exception Handling [tells Merrin that the bureau's Fake Conspiracy section is running an operation to plant evidence that Sparashki (the fictional alien Merrin happens to be dressed up as) are real](https://glowfic.com/replies/1860952#reply-1860952), and asks Merrin not to contradict this, and Merrin just ... goes along with it. (Elsewhere in the text, we're told that claiming to be a Sparashki isn't "lying", because no one would _expect_ someone to tell the truth in that situation.) It's in-character for Merrin to go along with it, because she's a pushover. My question is, why is it okay that Exception Handling has a Fake Conspiracies section, any more than it would have been if FTX or Enron explicitly had a Fake Accounting department? (Because dath ilan are the designated good guys? Well, so was FTX.)
+In the same story, an agent from Exception Handling [tells Merrin that the bureau's Fake Conspiracy section is running an operation to plant evidence that Sparashki (the fictional alien Merrin happens to be dressed up as) are real](https://glowfic.com/replies/1860952#reply-1860952), and asks Merrin not to contradict this, and Merrin just ... goes along with it. It's in-character for Merrin to go along with it, because she's a pushover. My question is, why is it okay that Exception Handling has a Fake Conspiracies section, any more than it would have been if FTX or Enron explicitly had a Fake Accounting department? (Because dath ilan are the designated good guys? Well, so was FTX.)
  
  As another notable example of dath ilan hiding information for the alleged greater good, in Golarion, Keltham discovers that he's a sexual sadist, and deduces that Civilization has deliberately prevented him from realizing this, because there aren't enough corresponding masochists to go around in dath ilan. Having concepts for "sadism" and "masochism" as variations in human psychology would make sadists like Keltham sad about the desirable sexual experiences they'll never get to have, so Civilization arranges for them to _not be exposed to knowledge that would make them sad, because it would make them sad_ (!!).
  
@@ -579,7 +630,7 @@ I started a new thread to complain about the attitude I was seeing (Subject: "No
  
  I wasn't buying the excuse that secret-Keeping practices that wouldn't be OK on Earth were somehow OK on dath ilan, which was asserted by authorial fiat to be sane and smart and benevolent enough to make it work. Or if I couldn't argue with authorial fiat: the reasons why it would be bad on Earth (even if it wouldn't be bad on dath ilan) are reasons why _fiction about dath ilan is bad for Earth_.
  
-And just—back in the 'aughts, Robin Hanson had this really great blog called _Overcoming Bias_. (You probably haven't heard of it.) I wanted that _vibe_ back, of Robin Hanson's blog in 2008—the will to _just get the right answer_, without all this galaxy-brained hand-wringing about who the right answer might hurt.
+And just—back in the 'aughts, Robin Hanson had this really great blog called _Overcoming Bias_. (You probably haven't heard of it, I said.) I wanted that _vibe_ back, of Robin Hanson's blog in 2008—the will to _just get the right answer_, without all this galaxy-brained hand-wringing about who the right answer might hurt.
  
  I would have expected a subculture descended from the memetic legacy of Robin Hanson's blog in 2008 to respond to that tripe about protecting people from being destroyed by the truth as a form of "recognizing independent agency" with something like—