From 03f445fa6dada1e48bba1e4300bc455b2d0a4972 Mon Sep 17 00:00:00 2001
From: "M. Taylor Saotome-Westlake"
Date: Sat, 7 Dec 2019 09:20:47 -0800
Subject: [PATCH] "I Tell Myself" 7 December drafting session 2: fucking with
 the model

---
 notes/i-tell-myself-sections.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/notes/i-tell-myself-sections.md b/notes/i-tell-myself-sections.md
index 7d99dcc..ef004b8 100644
--- a/notes/i-tell-myself-sections.md
+++ b/notes/i-tell-myself-sections.md
@@ -119,7 +119,7 @@ Someone asked me: "Wouldn't it be embarrassing if the community solved Friendly
 
 But the _reason_ it seemed _at all_ remotely plausible that our little robot cult could be pivotal in creating Utopia forever was _not_ "[Because we're us](http://benjaminrosshoffman.com/effective-altruism-is-self-recommending/), the world-saving good guys", but rather _because_ we were going to discover and refine the methods of _systematically correct reasoning_.
 
-If the people _marketing themselves_ as the good guys who are going to save the world using systematically correct reasoning are _not actually interested in doing systematically correct reasoning_ (because systematically correct reasoning leads to two or three conclusions that are politically "impossible" to state clearly in public, and no one has the guts to [_not_ shut up and thereby do the politically impossible](https://www.lesswrong.com/posts/nCvvhFBaayaXyuBiD/shut-up-and-do-the-impossible)), that's arguably _worse_ than the situation where
+If the people _marketing themselves_ as the good guys who are going to save the world using systematically correct reasoning are _not actually interested in doing systematically correct reasoning_ (because systematically correct reasoning leads to two or three conclusions that are politically "impossible" to state clearly in public, and no one has the guts to [_not_ shut up and thereby do the politically impossible](https://www.lesswrong.com/posts/nCvvhFBaayaXyuBiD/shut-up-and-do-the-impossible)), that's arguably _worse_ than the situation where the community doesn't exist at all—
 
 -----
 
@@ -185,11 +185,11 @@ The Popular Author definitely isn't trying to be cult leader. He just
 
 The "national borders" metaphor is particularly galling if—[unlike](https://slatestarcodex.com/2015/01/31/the-parable-of-the-talents/) [the](https://slatestarcodex.com/2013/06/30/the-lottery-of-fascinations/) Popular Author—you _actually know the math_.
 
-If I have a "blegg" concept for blue egg-shaped objects—uh, this is [our](https://www.lesswrong.com/posts/4FcxgdvdQP45D6Skg/disguised-queries) [standard](https://www.lesswrong.com/posts/yFDKvfN6D87Tf5J9f/neural-categories) [example](https://www.lesswrong.com/posts/yA4gF5KrboK2m2Xu7/how-an-algorithm-feels-from-inside), just [roll with it](http://unremediatedgender.space/2018/Feb/blegg-mode/)—what that _means_ is that (at some appropriate level of abstraction) there's a little [Bayesian network](https://www.lesswrong.com/posts/hzuSDMx7pd2uxFc5w/causal-diagrams-and-causal-models) in my head with "blueness" and "eggness" observation nodes hooked up to a central "blegg" category-membership node, such that if I see a black-and-white photograph of an egg-shaped object, I can use the observation of its shape to update my beliefs about its blegg-category-membership, and then use my beliefs about category-membership to update my beliefs about its blueness. This cognitive algorithm is useful if we live in a world of objects that have the appropriate structure—if the joint distribution P(blegg, blueness, eggness) approximately factorizes as P(blegg)·P(blueness|blegg)·P(eggness|blegg).
+If I have a "blegg" concept for blue egg-shaped objects—uh, this is [our](https://www.lesswrong.com/posts/4FcxgdvdQP45D6Skg/disguised-queries) [standard](https://www.lesswrong.com/posts/yFDKvfN6D87Tf5J9f/neural-categories) [example](https://www.lesswrong.com/posts/yA4gF5KrboK2m2Xu7/how-an-algorithm-feels-from-inside), just [roll with it](http://unremediatedgender.space/2018/Feb/blegg-mode/)—what that _means_ is that (at some appropriate level of abstraction) there's a little [Bayesian network](https://www.lesswrong.com/posts/hzuSDMx7pd2uxFc5w/causal-diagrams-and-causal-models) in my head with "blueness" and "eggness" observation nodes hooked up to a central "blegg" category-membership node, such that if I see a black-and-white photograph of an egg-shaped object, I can use the observation of its shape to update my beliefs about its blegg-category-membership, and then use my beliefs about category-membership to update my beliefs about its blueness. This cognitive algorithm is useful if we live in a world where objects have the appropriate statistical structure—if the joint distribution P(blegg, blueness, eggness) approximately factorizes as P(blegg)·P(blueness|blegg)·P(eggness|blegg).
 
-"Category boundaries" are just a _visual metaphor_ for the math: the set of things I'll classify as a blegg with probability greater than _p_ is conveniently _visualized_ as an area with a boundary in blueness–eggness space.
+"Category boundaries" are just a _visual metaphor_ for the math: the set of things I'll classify as a blegg with probability greater than _p_ is conveniently _visualized_ as an area with a boundary in blueness–eggness space. If you _don't understand_ the relevant math and philosophy—or are pretending not to understand only and exactly when it's politically convenient—you might think you can redraw the boundary any way you want. But you can't, because the "boundary" visualization is _derived from_ a statistical model which corresponds to _empirically testable predictions about the real world_.
 
-[wireheading and war are the only two reasons to]
+Fucking with category boundaries corresponds to fucking with the model, which corresponds to fucking with your ability to interpret sensory data. The only two reasons you could _possibly_ want to do this would be to wirehead yourself (corrupt your map to make the territory look nicer than it really is, making yourself _feel_ happier at the cost of sabotaging your ability to navigate the real world) or to wage information warfare (corrupt shared maps to sabotage other agents' ability to navigate the real world, in a way such that you benefit from their confusion).
 
 -----
 
-- 
2.17.1
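
To make the factorization in the new "blegg" paragraph concrete, here is a minimal numerical sketch of the naive-Bayes structure it describes, P(blegg, blueness, eggness) = P(blegg)·P(blueness|blegg)·P(eggness|blegg). The probability values below are made up purely for illustration; only the network shape and the inference pattern (observe shape, update category membership, then predict color) come from the text.

```python
# Illustrative sketch of the "blegg" Bayesian network described in the patch above.
# The structure P(blegg, blueness, eggness) = P(blegg) * P(blueness|blegg) * P(eggness|blegg)
# is from the text; the numbers themselves are hypothetical.

p_blegg = 0.5                                   # prior P(blegg)
p_blue_given_blegg = {True: 0.9, False: 0.1}    # P(blueness | blegg)
p_egg_given_blegg = {True: 0.9, False: 0.2}     # P(eggness | blegg)

# Observe an egg-shaped object in a black-and-white photo (eggness known,
# blueness unobserved). First update category membership by Bayes' rule:
joint_blegg = p_egg_given_blegg[True] * p_blegg
joint_not_blegg = p_egg_given_blegg[False] * (1 - p_blegg)
p_blegg_given_egg = joint_blegg / (joint_blegg + joint_not_blegg)

# Then use the category belief to predict the unobserved blueness:
p_blue_given_egg = (p_blue_given_blegg[True] * p_blegg_given_egg
                    + p_blue_given_blegg[False] * (1 - p_blegg_given_egg))

print(f"P(blegg | egg-shaped) = {p_blegg_given_egg:.2f}")   # 0.82
print(f"P(blue | egg-shaped)  = {p_blue_given_egg:.2f}")    # 0.75
```

With these made-up numbers, observing egg-shape pushes P(blegg) from 0.5 to about 0.82, which in turn pushes the predicted probability of blueness to about 0.75: exactly the shape-to-category-to-color update the paragraph describes, and the inference that gets corrupted if the category node is redrawn for non-epistemic reasons.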