+Ask the computer to assume that an individual's ancestry came from _K_ fictive ancestral populations where _K_ := 2, and it'll infer that sub-Saharan Africans are descended entirely from one, East Asians and some Native Americans are descended entirely from the other, and everyone else is an admixture. But if you set _K_ := 3, populations from Europe and the Near East (which were construed as admixtures in the _K_ := 2 model) split off as a new inferred population cluster. And so on.
+
+These ancestry groupings _are_ a "construct" in the sense that the groupings aren't "ordained by God"—the algorithm can find _K_ groupings for your choice of _K_—but _where_ it [draws those category boundaries](https://www.lesswrong.com/posts/esRZaPXSHgWzyB2NL/where-to-draw-the-boundaries) is a function of the data. The construct is doing _cognitive work_, concisely summarizing statistical regularities in the dataset (which is _too large_ for humans to hold in their heads all at once): a map that reflects a territory.
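+
+To make the "you pick _K_, the data picks the boundaries" point concrete, here's a toy sketch. Big caveats: it uses plain _k_-means on made-up one-dimensional numbers, which is a stand-in I chose for illustration, _not_ the admixture model the actual studies use (that one assigns fractional ancestry rather than hard labels). The function name `kmeans_1d` and the data are hypothetical.
+

```python
def kmeans_1d(points, k, iterations=100):
    """Plain Lloyd's-algorithm k-means on a list of numbers (needs k >= 2)."""
    pts = sorted(points)
    # Deterministic start: spread the initial centers evenly across the sorted data.
    centers = [pts[i * (len(pts) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in pts:
            # Assign each point to whichever center is currently nearest.
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return clusters

# Made-up data: three clumps, two of which sit closer to each other.
toy_data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 7.0, 7.1, 7.2]

for k in (2, 3):
    print(k, kmeans_1d(toy_data, k))
```

+
+Ask for two clusters and the two nearby clumps get lumped together; ask for three and they split apart. Same data, different _K_, and in neither case did I get to choose where the lines fell.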
+
+Twentieth-century theorists like Fisher and Haldane and whatshisface-the-guinea-pig-guy had already figured out a lot about how evolution works (stuff like, a mutation that confers a fitness advantage of _s_ has a probability of about 2<em>s</em> of sweeping to fixation), but a lot of hypotheses about recent human evolution weren't easy to test or even formulate until the genome was sequenced!
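+
+If you want to see where a number like 2<em>s</em> could come from, here's a minimal sketch under a textbook simplification that I'm assuming (it isn't spelled out above): treat the new mutation's lineage as a branching process in which each carrier leaves a Poisson-distributed number of offspring with mean 1 + _s_. The function name `survival_probability` is made up for this example.
+

```python
import math

def survival_probability(s, tol=1e-14, max_iter=1_000_000):
    """Chance that a single new mutant's lineage never dies out,
    assuming each carrier leaves a Poisson(1 + s) number of offspring."""
    q = 0.0  # extinction probability; iterate q <- exp((1 + s) * (q - 1))
    q_next = q
    for _ in range(max_iter):
        q_next = math.exp((1.0 + s) * (q - 1.0))
        if abs(q_next - q) < tol:
            break
        q = q_next
    return 1.0 - q_next

# Compare the branching-process answer with the 2s rule of thumb.
for s in (0.01, 0.05, 0.1):
    print(s, survival_probability(s), 2 * s)
```

+
+For small _s_ the answer lands a little under 2<em>s</em>, and the rule of thumb gets rougher as _s_ grows.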
+
+You might think that there wasn't enough _time_ in the 2–5k generations since we came forth out of Africa for much human evolution to take place: a new mutation needs to confer an unusually large benefit to sweep to fixation that fast. But what if you didn't actually need any new mutations? Natural selection on polygenic traits can also act on "standing variation": variation _already_ present in the population that was mostly neutral in previous environments, but is fitness-relevant to new selection pressures. The rapid response to selective breeding observed in domesticated plants and animals mostly doesn't depend on new mutations.
+
+Another mechanism of recent human evolution is _introgression_: early humans interbred with our Neanderthal and Denisovan "cousins", giving our lineage the chance to "steal" all their good alleles! In contrast to new mutations, which usually die out even when they're beneficial (that 2<em>s</em> rule again), alleles "flowing" from another population keep getting reintroduced, giving them more chances to sweep!
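+
+A back-of-the-envelope version of that argument, assuming (unrealistically) that every introduced copy is an independent trial with the same 2<em>s</em> chance of sweeping; `chance_of_at_least_one_sweep` is a name I made up:
+

```python
def chance_of_at_least_one_sweep(s, introductions):
    """Probability that at least one of several independently introduced
    copies of a beneficial allele goes on to sweep (toy model)."""
    per_copy = 2 * s  # the approximate per-copy fixation chance from the 2s rule
    return 1 - (1 - per_copy) ** introductions

# One lonely new mutation versus repeated gene flow from another population.
for n in (1, 10, 100):
    print(n, chance_of_at_least_one_sweep(0.01, n))
```

+
+With a 1% fitness advantage, a single copy sweeps about 2% of the time, but a hundred independent introductions get you to roughly 87%.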
+
+Population differences are important when working with genome-wide association studies, because a model "trained on" one population won't perform as well against the "test set" of a different population. Suppose you do a big study and find a bunch of SNPs that correlate with a trait, like schizophrenia or liking opera. The frequencies of those SNPs for two populations from the same continent (like Japanese and Chinese) will hugely correlate (Pearson's _r_ ≈ 0.97), but for more genetically distant populations from different continents, the correlation will still be big but not huge (like _r_ ≈ 0.8 or whatever).
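+
+Here's a toy simulation of what "the frequencies correlate" means. Everything in it is an assumption of mine: the allele frequencies are randomly generated, "drift" is faked with Gaussian noise, and the noise levels are arbitrary, so don't expect the printed numbers to match the real-world figures quoted above. The helper names `pearson_r` and `drifted` are hypothetical.
+

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def drifted(freqs, spread, rng):
    """Nudge each allele frequency by Gaussian noise, clamped to [0, 1]."""
    return [min(1.0, max(0.0, f + rng.gauss(0.0, spread))) for f in freqs]

rng = random.Random(0)  # fixed seed so reruns give the same numbers
ancestral = [rng.uniform(0.05, 0.95) for _ in range(2000)]

# Two recently separated populations: little drift from the shared ancestor.
close_a, close_b = drifted(ancestral, 0.03, rng), drifted(ancestral, 0.03, rng)
# Two long-separated populations: a lot more drift.
far_a, far_b = drifted(ancestral, 0.15, rng), drifted(ancestral, 0.15, rng)

r_close = pearson_r(close_a, close_b)
r_far = pearson_r(far_a, far_b)
print(r_close, r_far)
```

+
+The more the two populations have wandered from their shared starting point, the lower the correlation, even though it stays well above zero.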
+
+What do these differences in SNP frequencies mean in practice?? We ... don't know yet. At least some population differences are fairly well-understood: I'd tell you about sickle-cell and lactase persistence, except [then I would have to scream](/2017/Dec/interlude-xi/). There are some cases where we see populations independently evolve different adaptations that solve the same problem: [people living on the plateaus of Tibet and Peru have both adapted to high altitudes](https://www.pnas.org/content/104/suppl_1/8655.long), but the Tibetans did it by breathing faster and the Peruvians did it with more hemoglobin!
+
+Sorry, "the Tibetans did it with ..." is sloppy phrasing on my part; what I actually mean is that the Tibetans who weren't genetically predisposed to breathe faster were more likely to die without leaving children behind. That's how evolution works!
+
+The third part of the book is about genetic influences on class structure! Untangling the true causes of human variation is a really hard technical philosophy problem, but behavioral geneticists have at least gotten started with their simple _ACE_ model. It works like this: first, assume (that is, "pretend") that the genetic variation for a trait is _additive_ (if you have the appropriate SNP, you get more of the trait), rather than exhibiting _epistasis_ (where the effects of different loci interfere with each other) or Mendelian _dominance_ (where just one copy of an allele, out of two, determines the phenotype, and it doesn't matter whether your second copy of that gene is a different allele). Then we pretend that we can partition the variance in phenotypes as the sum of the "additive" genetic variance _A_, plus the environmental variance "common" within a family _C_, plus "everything else" (including measurement "error" and the not-shared-within-families "environment") _E_. Briefly (albeit at the risk of being _cliché_): nature, nurture, and _noise_.
+
+Then we can estimate the sizes of the _A_, _C_, and _E_ components by studying fraternal and identical twins. (If you hear people talking about "twin studies", this is what they mean, _not_ case studies of identical twins raised apart, which _are_ really cool but don't happen very often.) Both kinds of twins have the same family environment _C_ at the same time (parents, socioeconomic status, schools, _&c._), but identical twins are twice as genetically related to each other as fraternal twins, so the extent to which the identical twins are more similar is going to pretty much be because of their genes. "Pretty much" in the sense that while there are ways in which the assumptions of the model aren't quite true (assortative mating makes fraternal twins more similar in the ways their parents were _already_ similar before mating, identical twins might get treated more similarly by "the environment" on account of their appearance), the _quantitative_ effects of these deviations are probably pretty small.
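+
+Concretely, if the model's assumptions held, identical (MZ) twins would correlate at _A_ + _C_ and fraternal (DZ) twins at _A_/2 + _C_ (with total variance scaled to 1), and you can solve for the three components. These are the standard back-of-the-envelope formulas usually credited to Falconer; the function name `ace_estimates` and the example correlations below are made up for illustration, not results from any study.
+

```python
def ace_estimates(r_mz, r_dz):
    """Solve r_mz = A + C and r_dz = A/2 + C (total variance = 1) for A, C, E."""
    a = 2 * (r_mz - r_dz)   # additive genetic share
    c = 2 * r_dz - r_mz     # shared (family) environment share
    e = 1 - r_mz            # everything else, including measurement error
    return a, c, e

# Hypothetical twin correlations, chosen only to show the arithmetic.
a, c, e = ace_estimates(r_mz=0.75, r_dz=0.45)
print(f"A={a:.2f} C={c:.2f} E={e:.2f}")
```

+
+In this made-up example that works out to _A_ = 0.60, _C_ = 0.15, and _E_ = 0.25. Real analyses fit the model statistically rather than plugging two correlations into a formula, so treat this as the intuition, not the method.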
+
+Anyway, it turns out that the effect of the shared environment _C_ is way smaller than most people intuitively expect: next to zero for personality and adult intelligence. The environment matters, just not the part of the environment shared by siblings in the same family. Just not the part of the environment we know how to control. Thus, a lot of economic and class stratification actually ends up being along genetic lines: the nepotism of family wealth can buy opportunities and second chances, but it doesn't actually live your life for you.