The SAT has a significant advantage as a proxy IQ test over other standardized academic tests, such as the American College Testing (ACT), an alternative university admissions test, or the National Assessment of Educational Progress (NAEP), administered to representative samples of fourth and eighth graders in public schools every year. While the SAT measures the students' critical reasoning ability, both the ACT and the NAEP measure their learned knowledge of academic subjects. This distinction between the SAT and the ACT is well recognized by both testing services…. A principal component analysis of SAT and ACT scores shows that the former load on two factors (verbal and quantitative) while the latter load on four additional factors (information, English, natural sciences, and social studies). Frey and Detterman (2004) show that the correlation between SAT scores and g is .857 (corrected for nonlinearity) when the measure of g is the Armed Services Vocational Aptitude Battery, and it is .72 (corrected for restricted range) when the measure of g is Raven's Advanced Progressive Matrices.

This is not to deny the complicating nuances of the research. After all, a genome-wide association study of intelligence (Davies et al., 2011) determined that the single nucleotide polymorphisms it examined influenced fluid intelligence, a measure partially derived from Raven’s Matrices, more than crystallized intelligence, which tests of acquired knowledge (like vocabulary) can measure. However, the mysterious Flynn effect of rising intelligence in the industrialized world has elevated Raven’s Matrices scores more rapidly than scores on other intelligence tests (Hiscock, 2007).

The available SAT data allow score distribution graphs to be constructed for racial groups, but only for four years in the 1980s. In the case of black and white students, the years in question still likely reflect the present situation because the rapid decline in the black-white score gap occurred just prior to these years, and these score differences have, more or less, persisted since then. Though the verbal and writing subtest graphs might not elicit a Pavlovian reaction to bell curves, this seems to result from the limited score range chopping the black students’ curves into wedges. If the true IQ distribution of African Americans follows a bell-shaped Gaussian curve, then an artificial minimum SAT score could be misrepresenting the full ability spectrum of black students.
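The effect of an artificial score floor on a bell curve can be sketched with a small simulation. This is only an illustration under stated assumptions: the latent mean of 350 and standard deviation of 100 are hypothetical values chosen to make the effect visible, not figures from the SAT data, and `censor_at_floor` and `mean_and_sd` are helper names introduced here. The only real-world constant is the 200-point minimum of an SAT section.

```python
import random
import statistics

SAT_FLOOR = 200  # lowest reportable score on an SAT section


def censor_at_floor(scores, floor=SAT_FLOOR):
    """Report every score below `floor` as `floor`, as a score minimum does."""
    return [max(score, floor) for score in scores]


def mean_and_sd(scores):
    """Return the mean and population standard deviation of `scores`."""
    return statistics.fmean(scores), statistics.pstdev(scores)


# Hypothetical latent scores: the mean of 350 and SD of 100 are illustrative
# assumptions, not figures from the SAT data discussed in the text.
rng = random.Random(0)
latent = [rng.gauss(350, 100) for _ in range(100_000)]
observed = censor_at_floor(latent)

latent_mean, latent_sd = mean_and_sd(latent)
observed_mean, observed_sd = mean_and_sd(observed)
share_at_floor = sum(score == SAT_FLOOR for score in observed) / len(observed)

print(f"latent:   mean={latent_mean:.1f}  sd={latent_sd:.1f}")
print(f"observed: mean={observed_mean:.1f}  sd={observed_sd:.1f}")
print(f"share piled up at the floor: {share_at_floor:.1%}")
```

Under these assumptions the floor raises the observed mean, shrinks the observed standard deviation, and stacks the cut-off left tail into the lowest score bin, which is the wedge shape described above.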

In 1996, SAT score distributions were “recentered” to reflect a new 1990 reference group that replaced the old 1941 reference group. Prior to the recentering, the greater decline of average verbal scores relative to mathematics subtest scores had concerned the College Board. Recentering also lowered the mathematics standard deviation to make black, Hispanic, and female students “appear less below average.” The following graph shows that recentering increased verbal scores even more for the black students in the 1990 reference group, giving them a bell-shaped distribution. This does not convince me that the same occurred for actual post-recentering black SAT scores because the black-white gap remained virtually unchanged. Certainly, the SAT verbal and math subtest distributions for the general population shifted higher, as shown below. Notice that the percentages with the highest scores continued to increase even after the recentering, especially on the mathematics subtest.

Shifting all groups higher could hurt the black average verbal and writing SAT scores by revealing a full bell curve and thereby allowing the artificial floor to fall out from under the worst students, unless black performance improved simultaneously, causing the two phenomena to mask each other. However, if African Americans suddenly attained an extended bell-shaped distribution, I would expect an increase in their score variance on the verbal subtest, which would be reflected in an increased standard deviation. On the contrary, black students have long held the lowest standard deviations, and the graph of this quantity has been equally flat for math and verbal subtests.

The following graphs show the black and white score distributions without the population sizes being held equal. At the time, African Americans were the largest minority, and similar graphs for Hispanics and Asians reduce those groups’ curves to almost imperceptible puddles, so I shall forgo posting them.

As the standard deviations graph above already revealed, Asians comprise the most heterogeneous group, and I find their distribution to be the most fascinating. The most obvious characteristic of the verbal and writing graphs is their bimodal distributions, which one would expect in a group for whom English is frequently a second language. This matches the writing subtest distribution for Hispanics below, but the Asian verbal subtest graph has one other aspect lacking in its Hispanic counterpart. Despite the large number of poor performers on the left side, the most elite performers of the Asian graph appear to be present in roughly equal proportion to those of the white graph. In fact, a slightly higher proportion of Asians achieved the highest two verbal score ranges compared to the white group for each of the four years, and these were years prior to most of the Asian score improvement that I previously discussed.
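Why a group that combines native and second-language English speakers might show two humps can be sketched as a mixture of two normal curves. Everything numeric here is a hypothetical assumption for illustration (the weights, means, and standard deviations are not estimates from the SAT data), and `normal_pdf`, `mixture_pdf`, and `count_modes` are names introduced for this sketch.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def mixture_pdf(x, components):
    """Density of a mixture; `components` is a list of (weight, mean, sd)."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)


def count_modes(components, lo=200.0, hi=800.0, steps=2000):
    """Count strict local maxima of the mixture density on an even grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [mixture_pdf(x, components) for x in xs]
    return sum(
        1 for i in range(1, steps) if ys[i] > ys[i - 1] and ys[i] > ys[i + 1]
    )


# Hypothetical subgroups: weaker and stronger English speakers (assumed values).
two_subgroups = [(0.4, 300.0, 60.0), (0.6, 520.0, 70.0)]
# Hypothetical homogeneous group for comparison.
one_group = [(1.0, 450.0, 100.0)]

print("modes with two separated subgroups:", count_modes(two_subgroups))
print("modes with a single group:", count_modes(one_group))
```

The two humps appear only when the subgroup means sit far apart relative to their spreads; bring the assumed means close together and the mixture collapses back into a single, wider hump.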

On the mathematics subtest graph, the Asian distribution extends noticeably more into the higher ranges than the white distribution. Thus, a much greater proportion of Asians achieve the highest range of math performance, a point that I shall also extend to men. Larry Summers resigned as president of Harvard in 2006, following a 2005 faculty vote of no confidence, because he gave a speech in which he said the following:

There are three broad hypotheses … with respect to the presence of women in high-end scientific professions…. The second is what I would call different availability of aptitude at the high end…. It does appear that on many, many different human attributes—height, weight, propensity for criminality, overall IQ, mathematical ability, scientific ability—there is relatively clear evidence that whatever the difference in means—which can be debated—there is a difference in the standard deviation, and variability of a male and a female population…. Even small differences in the standard deviation will translate into very large differences in the available pool substantially out.

The standard deviations graph above validates Summers’ observation about differing aptitude variability between the sexes, and this is especially the case on the mathematics subtest. The following graphs illustrate just how much the standard deviation difference in math (plus a difference in mean) translates into substantially more male students in the highest aptitude levels. Dr. Summers, on behalf of Harvard University, I would like to offer you your job back.
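The arithmetic behind Summers’ last sentence can be sketched with the normal distribution. This is a minimal illustration, assuming two hypothetical groups with the same mean and standard deviations that differ by 10 percent; those numbers are assumptions chosen for the example, not estimates from the SAT data, and `upper_tail` and `tail_ratio` are names introduced here.

```python
import math


def upper_tail(threshold, mean, sd):
    """P(X > threshold) for X distributed Normal(mean, sd)."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


def tail_ratio(threshold, mean_a, sd_a, mean_b, sd_b):
    """How many times more common group A is than group B above threshold."""
    return upper_tail(threshold, mean_a, sd_a) / upper_tail(threshold, mean_b, sd_b)


# Hypothetical groups in standard-deviation units: equal means, and group A
# with a 10 percent larger standard deviation (assumed for illustration).
MEAN_A, SD_A = 0.0, 1.1
MEAN_B, SD_B = 0.0, 1.0

for cutoff in (2.0, 3.0, 4.0):
    ratio = tail_ratio(cutoff, MEAN_A, SD_A, MEAN_B, SD_B)
    print(f"cutoff at +{cutoff:.0f} SD: A-to-B ratio = {ratio:.2f}")
```

With equal means, the ratio exceeds one at every cutoff above the mean and keeps growing as the cutoff moves further out; adding a small difference in means, as the text notes for the mathematics subtest, widens the gap further.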

Davies, G., Tenesa, A., Payton, A., Yang, J., Harris, S. E., Liewald, D., Ke, X., Le Hellard, S., Christoforou, A., Luciano, M., McGhee, K., Lopez, L., Gow, A. J., Corley, J., Redmond, P., Fox, H. C., Haggarty, P., Whalley, L. J., McNeill, G., Goddard, M. E., Espeseth, T., Lundervold, A. J., Reinvang, I., Pickles, A., Steen, V. M., Ollier, W., Porteous, D. J., Horan, M., Starr, J. M., Pendleton, N., Visscher, P. M., & Deary, I. J. (2011). Genome-wide association studies establish that human intelligence is highly heritable and polygenic. Molecular Psychiatry, 16 (10), 996-1005. PMID: 21826061

Hiscock, M. (2007). The Flynn effect and its relevance to neuropsychology. Journal of Clinical and Experimental Neuropsychology, 29 (5), 514-529. DOI: 10.1080/13803390600813841

Kanazawa, S. (2006). IQ and the wealth of states. Intelligence, 34 (6), 593-600. DOI: 10.1016/j.intell.2006.04.003