check in

diff --git a/content/drafts/standing-under-the-same-sky.md b/content/drafts/standing-under-the-same-sky.md
index 3b08f94..4ada27e 100644
--- a/content/drafts/standing-under-the-same-sky.md
+++ b/content/drafts/standing-under-the-same-sky.md
@@ -111,24 +111,47 @@ Which brings me to the second reason the naïve anti-extortion argument might fa
  
  Yudkowsky has [a proposal for how bargaining should work between agents with different notions of "fairness"](https://www.lesswrong.com/posts/z2YwmzuT7nWx62Kfh/cooperating-with-agents-with-different-ideas-of-fairness).
  
-Suppose Edgar and Fiona are splitting a pie, and if they can't agree on how to split it, they have to fight over it, destroying some of the pie in the process. Edgar thinks the fair outcome is that they each get half the pie. Fiona claims that she contributed more ingredients to the baking process and that it's therefore fair that she gets 75% of the pie, pledging to fight if offered anything less.
+Suppose Edgar and Fiona are splitting a pie, and if they can't initially agree on how to split it, they have to fight over it until they do, destroying some of the pie in the process. Edgar thinks the fair outcome is that they each get half the pie. Fiona claims that she contributed more ingredients to the baking process and that it's therefore fair that she gets 75% of the pie, pledging to fight if offered anything less.
  
-If Edgar were a causal decision theorist, he would agree to the 75/25 split, if 25% of the pie is better than fighting. Yudkowsky argues that this is irrational: if Edgar is willing to agree to a 75/25 split, then Fiona has no incentive not to adopt such a self-favoring definition of "fairness". (And _vice versa_ if Fiona's concept of fairness is the "correct" one.)
+If Edgar were a causal decision theorist, he might agree to the 75/25 split, reasoning that 25% of the pie is better than fighting until the pie is destroyed. Yudkowsky argues that this is irrational: if Edgar is willing to agree to a 75/25 split, then Fiona has no incentive not to adopt such a self-favoring definition of "fairness". (And _vice versa_ if Fiona's concept of fairness is the "correct" one.)
  
-Instead, Yudkowsky argues, Edgar should behave so as to only do worse than the fair outcome if Fiona _also_ does worse: for example, by accepting a 32/48 split (where 100−(32+48) = 20% of the pie has been destroyed by the costs of fighting) or an 18/42 split (where 40% of the pie has been destroyed).
+Instead, Yudkowsky argues, Edgar should behave so as to only do worse than the fair outcome if Fiona _also_ does worse: for example, by accepting a 48/32 split (after 100−(32+48) = 20% of the pie has been destroyed by the costs of fighting) or a 42/18 split (where 40% of the pie has been destroyed). This isn't Pareto-optimal (it would be possible for both Edgar and Fiona to get more pie by reaching an agreement with less fighting), but it's worth it to Edgar to burn some of Fiona's utility fighting in order to resist being exploited by her, and at least it's better than the equilibrium where the pie gets destroyed (which is Nash because neither party can unilaterally stop fighting).
  
-[TODO: defying threats, cont'd—
- 
- * How does this map on to the present situation, though? Does he think he's playing Nash, or does he think he's getting gains-from-trade? (Either figure this out, or write some smart sentences about my confusion)
+It seemed to me that in the contest over the pie of Society's shared map, the rationalist Caliphate was letting itself get exploited by the progressive Egregore, doing worse than the fair outcome without dealing any damage to the Egregore in return. Why?
  
+The logic of "dump stats", presumably. Bargaining to get AI risk on the shared map—not even to get it taken seriously as we would count "taking it seriously", but just acknowledged at all—was hard enough. Trying to challenge the Egregore about an item that it actually cared about would trigger more fighting than we could afford.
  
-https://twitter.com/zackmdavis/status/1206718983115698176
-> 1940s war criminal defense: "I was only following orders!"
+I told the pie-splitting illustration as a symmetrical story: if Edgar and Fiona destroy the pie fighting, then neither of them gets any pie. But in more complicated scenarios (including the real world), there was no guarantee that non-Pareto Nash equilibria were equally bad for everyone.
+
+I'd had a Twitter exchange with Yudkowsky in January 2020 that revealed some of his current-year thinking about Nash equilibria. I [had Tweeted](https://twitter.com/zackmdavis/status/1206718983115698176):
+
+> 1940s war criminal defense: "I was only following orders!"  
  > 2020s war criminal defense: "I was only participating in a bad Nash equilibrium that no single actor can defy unilaterally!"
  
+(The language of the latter being [a reference to Yudkowsky's _Inadequate Equilibria_](https://equilibriabook.com/molochs-toolbox/).)
+
+Yudkowsky quote-Tweet dunked on me:
+
+> [TODO: well, YES]
+
+I pointed out the voting case as one where he seemed to be disagreeing with his past self, linking to 2008's "Stop Voting for Nincompoops". What changed his mind?
+
+"Improved model of the social climate where revolutions are much less startable or controllable by good actors," he said. "Having spent more time chewing on Nash equilibria, and realizing that the trap is _real_ and can't be defied away even if it's very unpleasant."
+
+In response to Sarah Constantin mentioning that there was no personal cost to voting third-party, Yudkowsky pointed out that the problem was the third-party spoiler effect, not personal cost: "People who refused to vote for Hillary didn't pay the price, kids in cages did, but that still makes the action nonbest."
+
+[TODO: look up the extent to which "kids in cages" were also a thing during the Obama and Biden administrations]
+
+I asked what was wrong with the disjunction from "Stop Voting for Nincompoops", where the earlier Yudkowsky had written that it's hard to see who should accept the argument to vote for the lesser of two evils, but refuse to accept the argument against voting because it won't make a difference. Unilaterally voting for Clinton doesn't save the kids!
+
+"Vote when you're part of a decision-theoretic logical cohort large enough to change things, or when you're worried about your reputation and want to be honest about whether you voted," Yudkowsky replied.
+
+"How do I compute whether I'm in a large enough decision-theoretic cohort?" I asked. Did we know that, or was that still on the open problems list?
+
+Yudkowsky said that he traded his vote for a Clinton swing state vote, partially hoping that that would scale [...]
+
+
  
- * I asked him why he changed his mind about voting
- * "Vote when you're part of a decision-theoretic logical cohort large enough to change things, or when you're worried about your reputation and want to be honest about whether you voted."
   * So maybe he doesn't think he's part of a decision-theoretic logical cohort large enough to resist the egregore, and he's also not worried about his reputation for resisting the egregore
   * If his reptuation in the eyes of people like me just isn't that valuable, I guess I can't argue with that
  
@@ -607,7 +630,7 @@ I started a new thread to complain about the attitude I was seeing (Subject: "No
  
  I wasn't buying the excuse that secret-Keeping practices that wouldn't be OK on Earth were somehow OK on dath ilan, which was asserted by authorial fiat to be sane and smart and benevolent enough to make it work. Or if I couldn't argue with authorial fiat: the reasons why it would be bad on Earth (even if it wouldn't be bad on dath ilan) are reasons why _fiction about dath ilan is bad for Earth_.
  
-And just—back in the 'aughts, Robin Hanson had this really great blog called _Overcoming Bias_. (You probably haven't heard of it.) I wanted that _vibe_ back, of Robin Hanson's blog in 2008—the will to _just get the right answer_, without all this galaxy-brained hand-wringing about who the right answer might hurt.
+And just—back in the 'aughts, Robin Hanson had this really great blog called _Overcoming Bias_. (You probably haven't heard of it, I said.) I wanted that _vibe_ back, of Robin Hanson's blog in 2008—the will to _just get the right answer_, without all this galaxy-brained hand-wringing about who the right answer might hurt.
  
  I would have expected a subculture descended from the memetic legacy of Robin Hanson's blog in 2008 to respond to that tripe about protecting people from being destroyed by the truth as a form of "recognizing independent agency" with something like—