+What's a _woman_? An adult human female. (Let's [not play dumb about this](/2018/Apr/reply-to-the-unit-of-caring-on-adult-human-females/) today.) Okay, but then what does _female_ mean? One common and perfectly serviceable definition: of the sex that produces larger gametes—ova, eggs.
+
+That's one common and perfectly serviceable definition in the paltry, commonplace _real_ world—but not in _the world of the imagination!_ We could _imagine_ the existence of a creature that looks and acts exactly like an adult human male down to the finest details, _except_ that its (his?) gonads produce eggs, not sperm! So one might argue that this would be a _female_ and presumably a _woman_, according to our definitions, yes?
+
+But if you saw this person on the street or even slept in their bed, you wouldn't want to call them a woman, because everything about them that you can observe looks like that of an adult human male. If you're not a reproductive-health lab tech and don't look at the photographs in biology textbooks, you'll never _see_ the gametes someone's body produces. (You can see semen, but the individual spermatozoa are too small to see without a microscope; people [didn't even know that ova and sperm _existed_ until the 17th century](https://onlinelibrary.wiley.com/doi/full/10.1111/j.1439-0531.2012.02105.x).) Does that mean this common definition of _female_ isn't perfectly serviceable after all?
+
+No, because humans whose gonads produce eggs but who appear male in every other respect are something I just made up out of thin air for the purposes of this blog post. They don't exist in the real world. What this really shows is that the cognitive technology of "words" having "definitions" doesn't work in _the world of the imagination_, because _the world of the imagination_ encompasses (at a minimum) _all possible configurations of matter_. Words are [short messages that compress a lot of information](https://www.lesswrong.com/posts/mB95aqTSJLNR9YyjH/message-length), but what it _means_ for the world to contain information is that some things in the world are more probable than others.
+
+To see why, let's take a brief math detour and review some elementary information theory. Instead of the messy real world, take a restricted setting: the world of strings of 20 bits. Suppose you wanted to devise an efficient _code_ to represent elements of this world with _shorter_ strings, such that you could say (for example) `01100` (in the efficient code, using just 5 bits) and the people listening to you would know that what you actually saw in the world was (for example) `01100001110110000010`.
+
+If every length-20 bitstring in the world has equal probability, this can't be done: there are 2<sup>20</sup> (= 1,048,576) length-20 strings and only 2<sup>5</sup> (= 32) length-5 codewords; there aren't enough codewords to go around to cover all the strings in this world. It's worse than that: if every length-20 bitstring in the world has equal probability, you can't have labels that compress information _at all_: if you said that the first 19 bits of something you saw in the world were `0110000111011000001`, the people listening to you would be completely clueless as to whether the whole thing was `0110000111011000001`**`0`** or `0110000111011000001`**`1`**. _Locating_ a book in [Jorge Luis Borges's _Library of Babel_](https://en.wikipedia.org/wiki/The_Library_of_Babel) is mathematically equivalent to writing it yourself.
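The counting argument can be checked directly. A quick sketch in Python, using the numbers from the text (nothing here beyond the standard library):

```python
import math

# 2**20 possible length-20 bitstrings, but only 2**5 length-5 codewords:
# a lossless 5-bit code can't cover them all.
n_strings = 2 ** 20   # 1,048,576
n_codewords = 2 ** 5  # 32
assert n_codewords < n_strings

# Under a uniform distribution, the Shannon entropy -- the best achievable
# average code length in bits -- is the full 20 bits, so no encoding
# compresses at all:
entropy = math.log2(n_strings)
print(entropy)  # 20.0
```

The `assert` is the pigeonhole step: 32 codewords can't name 1,048,576 distinct strings, and the entropy calculation says the same thing in the language of average code length.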
+
+However, in the world of a _non-uniform probability distribution_ over strings of 20 bits, compression—and therefore language—_is_ possible. If almost all the bitstrings you actually saw in the world were either all-zeros (`00000000000000000000`) or all-ones (`11111111111111111111`), with a very few exceptions that were still _mostly_ one bit or the other (like `00010001000000000000` or `11101111111011011111`), then you could devise a very efficient encoding.
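To make that concrete, suppose (these probabilities and the escape scheme are illustrative assumptions, not anything canonical) the all-zeros and all-ones strings each occur 45% of the time, with the remaining 10% spread over the rare exceptions. A minimal sketch of a prefix code for that world:

```python
# A sketch of a prefix code for the skewed world described above.
# The probabilities (0.45 / 0.45 / 0.10) are illustrative assumptions.
ZEROS = "0" * 20
ONES = "1" * 20

def encode(s: str) -> str:
    if s == ZEROS:
        return "0"       # 1-bit codeword for the most common string
    if s == ONES:
        return "10"      # 2-bit codeword for the next most common
    return "11" + s      # escape prefix, then the raw 20 bits

def decode(code: str) -> str:
    if code == "0":
        return ZEROS
    if code == "10":
        return ONES
    return code[2:]      # strip the "11" escape prefix

# The code round-trips losslessly:
assert decode(encode(ZEROS)) == ZEROS
assert decode(encode("00010001000000000000")) == "00010001000000000000"

# Expected code length under the assumed distribution:
avg_bits = 0.45 * 1 + 0.45 * 2 + 0.10 * 22
print(avg_bits)  # ~3.55 bits on average, versus 20 uncompressed
```

The rare exceptions cost 22 bits instead of 20, but because they're rare, the average message length collapses from 20 bits to about 3.55: concentrated probability mass is exactly what makes short labels possible.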