Title: From What I've Tasted of Desire
Date: 2017-01-24 22:48
Category: commentary
Tags: bullet-biting, my robot cult

> _Oh, we have to get this right
> Yes, we have to make them see_
>
> —["Ballad of the Crystal Empire"](https://www.youtube.com/watch?v=kESi3hksPg0), _My Little Pony: Friendship Is Magic_

_(Epistemic status: somewhat tongue-in-cheek, but also far more plausible than it has any right to be. Assumes the correctness of [Blanchard's transsexualism typology](https://en.wikipedia.org/wiki/Blanchard's_transsexualism_typology) without arguing it here.)_

So, not a lot of people understand this, but the end of the world is, in fact, nigh. _Conditional_ on civilization not collapsing (which is itself a _kind_ of end of the world), sometime in the next century or so, someone is going to invent better-than-human artificial general intelligence. And from that point on, humans are not really in control of what happens in this planet's future light cone.

This is a counterintuitive point. It's tempting to think that you could program the AI to just obey orders ("Write an adventure novel for my daughter's birthday", "Output the design of a nanofactory") and not otherwise intervene in (or take over) the universe. And maybe [something like that](https://arbital.com/p/genie/) could be made to work, but it's _much_ harder than it looks.

Our simple framework for benchmarking how intelligence has to work is _expected utility maximization_: model the world; for each action in some set of available actions, use your model to compute a probability distribution over outcomes conditional on choosing that action; and then perform the action with the highest expected utility with respect to your utility function (a mapping from outcomes to ℝ). Any agent that behaves in a way that can't be shoved into this framework is in violation of the [von Neumann–Morgenstern axioms](https://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem), which look so "reasonable" that [we expect any "reasonable" agent to self-modify](https://selfawaresystems.com/2007/11/30/paper-on-the-basic-ai-drives/) to be in harmony with them.
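
As a toy illustration of that loop, here is a minimal Python sketch. Everything in it is made up for the example: the two actions, the three outcomes, the probabilities in `WORLD_MODEL`, and the payoffs in `UTILITY` are assumptions, not anything a real system would run on. It only shows the shape of the computation: score each action by its probability-weighted utility, then take the argmax.

```python
from typing import Dict

# Hypothetical world model: for each action, a probability
# distribution over outcomes conditional on choosing that action.
WORLD_MODEL: Dict[str, Dict[str, float]] = {
    "safe_bet": {"small_win": 1.0},
    "gamble": {"big_win": 0.25, "nothing": 0.75},
}

# Hypothetical utility function: a mapping from outcomes to real numbers.
UTILITY: Dict[str, float] = {
    "small_win": 1.0,
    "big_win": 3.0,
    "nothing": 0.0,
}


def expected_utility(action, world_model, utility):
    """Probability-weighted average utility of the outcomes of `action`."""
    return sum(p * utility[outcome] for outcome, p in world_model[action].items())


def choose_action(actions, world_model, utility):
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, world_model, utility))


best = choose_action(WORLD_MODEL.keys(), WORLD_MODEL, UTILITY)
print(best)  # safe_bet: 1.0 beats the gamble's 0.25 * 3.0 + 0.75 * 0.0 = 0.75
```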

So as AIs get more and more general, more like agents capable of autonomously solving new problems rather than unusually clever-looking ordinary computer programs, we should expect them to look more and more like expected utility maximizers, optimizing the universe with respect to some internal value criterion.

But humans are [a mess of conflicting desires](http://lesswrong.com/lw/l3/thou_art_godshatter/) inherited from our evolutionary and sociocultural history; we don't _have_ a utility function written down anywhere that we can just put in the AI. So if the systems that ultimately run the world end up with a utility function that's _not_ in the incredibly specific class of those we would have wanted if we knew how to translate everything humans want or would-want into a utility function, then the machines disassemble us for spare atoms and tile the universe with _something else_. There's no _reason_ for them to protect human life or forms of life that we would find valuable unless we specifically _code that in_.

This looks like a hard problem. This looks like a _really_ hard problem with _unimaginably_ high stakes: once the handoff of control of our civilization from humans to machines happens, we don't get a second chance to do it over. The ultimate fate of the human species rests on the competence of the AI research community: the inferential power and discipline to _cut through to the correct answer_ and _bet the world on it_, rather than clinging to one's favorite pet hypothesis and leaving science to advance funeral by funeral.

Stereotypically at least, computer programming is _the_ quintessential profession of autogynephilic trans women, although it's unclear how much of this is inherent to the work (a correlation between erotic target location erroneousness and general nerdiness) and how much is just a selection effect (well-to-do programmers with non-customer-facing jobs in Silicon Valley can afford to take the "publicly decide that this is my True Gender Identity" trajectory, whereas businessmen, lawyers, and poor people are trapped in the "secret, shameful crossdressing/dreaming" trajectory).

Thus, the bad [epistemic hygiene](http://lesswrong.com/lw/u/the_ethic_of_handwashing_and_community_epistemic/) habits of the trans community that are required to maintain the socially-acceptable alibi that transitioning is about expressing some innate "gender identity" are necessarily spread to the computer science community, as an [intransigent minority](https://medium.com/incerto/the-most-intolerant-wins-the-dictatorship-of-the-small-minority-3f1f83ce4e15) of trans activist-types successfully enforces social norms mandating that everyone must _pretend not to notice_ that trans women are eccentric men. With social reality placing such tight constraints on perception of actual reality, our chances of developing the advanced epistemology needed to rise to the occasion of solving the alignment problem seem slim at best. (If we can't put our weight down on the right answer to a _really easy_ scientific question like the two-type taxonomy of MtF, which lots of people [just _notice_](https://sillyolme.wordpress.com/2010/02/20/do-as-i-say-not-as-i-do/) without having to do careful research, then what hope do we have for hard problems?)

Essentially, we may be living in a scenario where the world is _literally destroyed specifically because no one wants to talk about their masturbation fantasies_.