
Curveballs: The Fuzzy Math Behind Sex Discrimination in the Sciences

As the evidence for sex discrimination in the sciences mounts, media pundits continue to cite math test scores as proof of innate differences between women and men. Here’s why the numbers don’t add up.

If you are going to be a provocateur, and your bully pulpit happens to be a forum of academics, a certain grasp of the facts is advisable – especially if you are bent on provoking thought outside your own discipline. This was not merely lost on Larry Summers, the former president of Harvard University, when he ventured to explain why women were under-represented in math and science departments and suggested, among other reasons, that women were innately compromised in this kind of cognitive functioning; it was also overlooked by many in the punditocracy who rallied to his defense in the name of academic freedom.

“What is it about the word ‘provoke’ those Harvard intellectuals don't understand?” asked the editorial page of the Boston Herald. “The transcript of Harvard University president Larry Summers' now infamous remarks about a female's innate scientific capabilities proves he was doing just what he said he was doing, provoking discussion.”

If anything should have renewed this discussion – and perhaps drawn it to a conclusion – it was the recent publication of a report by the National Academy of Sciences announcing that innate intelligence had nothing to do with the gender disparities in science and engineering. Rather, bias, discrimination and “outmoded institutional structures” were responsible for holding women back. Adding to the smack-down, Inside Higher Education released details from an as-yet-unpublished survey of 1,500 academics, which found that only one percent believed differing ability was a cause of the gender gap.

But apart from an intemperate column in the New York Times by John Tierney, who claimed that the NAS report was a “cynical” act of political correctness, the news passed without the kind of follow-up discussion that demonstrates some measure of a lesson learned. This was remarkable, given that Harvard law professor Alan Dershowitz had described the critical drubbing meted out to Summers as sounding “like the trial of Galileo."

The problem was that, unlike Galileo versus the Catholic Church, Summers provoked a debate in which his academic interlocutors were, if not smarter on average, then smarter on the particulars of this issue. And so when the pundits thundered about academic freedom being imperiled after Summers was driven to apologize for his comments, it was a distinctly dumbed-down, esteem-raising vision of academic freedom that was being advanced: the freedom of the amateur to expound without getting a slap-down from an expert.

“Forgive Larry Summers. He did not know where he was," jabbed George Will in the Washington Post. “He thought he was speaking in a place that encourages uncircumscribed intellectual explorations. He was not. He was on a university campus…”

Of course, given that the role of the pundit in the American media is to expound authoritatively and passionately on topics he or she may only have a passing acquaintance with, one might say that sympathy for Summers was a matter of self-interest. But it was difficult not to conclude that there was a broader political agenda, and that conservatives were itching to let rip at the supposedly underlying feminist and egalitarian orthodoxies because they were able to sound high-minded, and even “scientific” in doing so. As Will explained:

“Men and women have genetically based physical differences; the brain is a physical thing -- part of the body. Is it unthinkable -- is it even counterintuitive -- that this might help explain, for example, the familiar fact that more men than women achieve the very highest scores in mathematics aptitude tests? There is a vast and growing scientific literature on possible gender differences in cognition. Only hysterics denounce interest in those possible differences…”

Or perhaps only hysterical men refuse to accept that cognitive abilities have little to do with the gender disparity in the sciences?

Those who defended Summers seemed to have some powerful data on their side, pointing to SAT scores to explain why men perform both better and worse than women in math: the bell curve was flatter and wider for men, with more dunces at one tail and more geniuses at the other than in the curve for women. In other words, fewer women than men were as dumb or as brilliant at math: men are more varied in performance on these tests, while women tend to clump in the “middle.” Summers referred to this phenomenon as "the availability of aptitude at the high end."

Assuming that SAT math scores are a proxy for the kind of intelligence required to be a world-class researcher in math and physics, men will dominate the part of the bell curve 3.5 to 4 standard deviations above the mean. Considering that more than 99 percent of the population falls within three standard deviations of the mean, Summers was referring to a very, very small number of people, which in the case of Harvard roughly corresponds to those considered for jobs as professors of physics.
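To see why the variability argument sounded so powerful, it helps to work through the arithmetic it depends on. The sketch below is illustrative only: it assumes two perfectly normal distributions with the same mean, and the 10 percent difference in spread is a made-up number, not a figure from the SAT or from Summers' remarks. The function names are likewise hypothetical.

```python
import math

def upper_tail(z):
    """Fraction of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def tail_ratio(cutoff, spread_ratio):
    """Ratio of the wider group's share to the narrower group's share above cutoff.

    Both groups have mean 0. The narrower group has standard deviation 1;
    the wider group has standard deviation spread_ratio (a hypothetical value).
    The cutoff is measured in the narrower group's standard deviations.
    """
    return upper_tail(cutoff / spread_ratio) / upper_tail(cutoff)

# Hypothetical 10 percent difference in spread, checked at several cutoffs.
for cutoff in (1.0, 2.0, 3.0, 4.0):
    print(f"{cutoff} SD above the mean: ratio {tail_ratio(cutoff, 1.1):.2f}")
```

Under these assumptions the ratio grows the further out the cutoff is placed, which is the whole force of the argument. It is also why everything hinges on whether the model can be trusted that far from the mean.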

It wasn’t just conservatives who thought that this was a really good argument – and that those who denounced Summers were guilty of putting gender politics ahead of dispassionate, intelligent inquiry. As William Saletan wrote in Slate just after Summers’ speech,

"It's a claim that the distribution of male scores is more spread out than the distribution of female scores—a greater percentage at both the bottom and the top. Nobody bats an eye at the overrepresentation of men in prison. But suggest that the excess might go both ways, and you're a pig."

And Tierney, the Times’ “libertarian” columnist, in castigating the NAS for indulging in “the kind of science that you expect to find in The Onion,” explained,

“One well-documented difference is the disproportionately large number of boys scoring in the top percentile of the SAT math test. And when you compare boy math whizzes with girl math whizzes, more differences appear. The boys score much higher on the math portion of the SAT than on the verbal, whereas the girls are more balanced -- high on the verbal as well as the math.”

What these critics missed is that there are real problems with using SAT scores (or other test scores) as a proxy for mathematical ability and intelligence at the highest level. As with all tests, the SAT measures exactly what it tests: the ability to quickly solve specific problems correctly on a high-pressure, timed exam.

Invariably, there are students who are poor test-takers but good “thinkers;” but more to the point, while the resulting scores may indicate “achievement” or “mastery” of a certain skill set, they cannot distinguish those who are truly brilliant from those who are just “very good” at the skill set.

There is also an assortment of extremely important skills involved in success as an academic that are not measured on these tests at all – skills like perseverance, patience, time commitment, interest, ability to work with others, ability to manage many projects together, ability to express ideas to others, ability to bridge different topics and make connections between different fields, and so forth.

When scientists are asked to list the “very best” scientists in their field, reputation derives not from the ability to perform basic computations quickly, but rather from the ability to generate deep ideas that have a profound impact on science. This simply cannot be measured with test scores.

Even when you take specific activities like doing theoretical physics, there is no single way of lining people up and saying Person A is more intelligent than Person B. Even though we make judgments and can reach consensus – and are occasionally able to say that one person strictly holds more, or better, information than another (say a calculus teacher over a student of calculus) – there's no way to distinguish objectively between top researchers by creating a curve with just one input variable, namely test scores. Intelligence is multi-dimensional.

Summers and his defenders also made the mistake of conflating the data with a model of the data. Scores on tests like the SAT tend to fall into a pattern, which is termed the “normal curve.” Typically, statisticians come up with such a model of the data to describe how test scores are distributed, and then use these models to explore certain aspects of the data. This happens to work very well when you talk in generalities; for example, one can calculate one standard deviation from the mean using properties of the normal curve, and find the approximate score values in which about 68 percent of the population will fall.
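As a rough illustration of the kind of calculation described above, the following sketch computes the share of a normally distributed population that falls within one, two and three standard deviations of the mean. The mean of 500 and standard deviation of 100 are placeholder values chosen for readability, not actual SAT statistics.

```python
import math

def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Placeholder scale: these are assumed values, not real SAT figures.
MEAN, SD = 500, 100

for k in (1, 2, 3):
    low, high = MEAN - k * SD, MEAN + k * SD
    print(f"scores from {low} to {high}: {share_within(k):.1%} of the population")
```

The one-standard-deviation band is where the "about 68 percent" figure comes from. Note that it is a property of the model, not something read directly off the scores.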

But this ideal relationship breaks down far away from the mean, where there is very little data (less than 0.004 percent of the population scores four standard deviations better than the mean). For tests on which there is a maximal score, the data cannot possibly fit exactly into the model of a “normal curve” because a normal curve has an infinite tail. In other words, even if the SAT were a good proxy for talent at math and physics, such talent would be poorly modeled by normal curves at the extreme ends.
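Both points can be checked with a few lines of arithmetic. The sketch below first computes the share of a normal population lying more than four standard deviations above the mean, then shows what happens when a test has a ceiling. The scale used (mean 500, standard deviation 100, maximum score 800) is an assumption made for illustration, not a set of official SAT figures.

```python
import math

def upper_tail(z):
    """Fraction of a standard normal distribution lying above z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Assumed, illustrative scale; not official test statistics.
MEAN, SD, MAX_SCORE = 500.0, 100.0, 800.0

# Share of the population the model places beyond four standard deviations.
share_4sd = upper_tail(4.0)
print(f"beyond 4 SD: {share_4sd * 100:.4f} percent of the population")

# How many standard deviations above the mean the ceiling sits.
z_ceiling = (MAX_SCORE - MEAN) / SD
print(f"the maximum score sits {z_ceiling:.1f} SD above the mean")

# Share the model places above a score nobody can actually achieve.
share_above_ceiling = upper_tail(z_ceiling)
print(f"model share above the ceiling: {share_above_ceiling * 100:.3f} percent")
```

On this assumed scale, a score four standard deviations above the mean would lie above the maximum, so the test cannot observe it at all. The curve keeps going; the data stop.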

Another mistake made by those who defended Summers’ theory is the conviction that these test scores are measuring something innate, when there is a lot of evidence to suggest otherwise. One of the most persuasive arguments is that the gap between the genders is diminishing; girls now score higher on these tests than they did twenty years ago, and even twenty years ago they did better than they had done fifty years ago. If the tests were measuring innate talent, we would not see significant differences from one generation to the next. Even if there weren’t problems with using test scores as a proxy for talent at the highest levels in science, the evidence suggesting that the measured differences are innate is, from an academic point of view, woefully weak. As the NAS noted, the gap between males and females at the very highest end of mathematical ability is narrowing.

Moreover, there is no such difference between male and female test scores in Japan, while in Iceland women do better than men. If you reduce mathematical ability to gender differences, you have to find a plausible way of explaining away these highly inconvenient facts.

One of the logical requirements of scientific investigation is that you must abandon a theory when it fails to account for the facts or fit the data. But to the posse of Henry Higginses in the American media, it seems impossible to let go of the axiom that a woman cannot be more like a man in math and physics. Which leads us to ask, what is it about this class of graying men that just doesn’t get why you can’t graph away the gender disparity in the sciences with a curve?

