Sampling bias in the Kurtz, Zelen, and Abell "sTARBABY" test of the astrological Mars effect

Review of “Results of the U.S. test of the ‘Mars effect’ are negative”

In response to French researcher Michel Gauquelin (1973), who claimed to have found support for the astrological properties of some of the planets, philosopher Paul Kurtz, later joined by statistician Marvin Zelen and astronomer George Abell, all members of the skeptical American organization CSICOP, conducted an experiment in 1977-78 that became known as the KZA or "sTARBABY" test. The purpose of the experiment was to falsify the “Mars effect,” a significant excess of champion athletes born when Mars was either rising or culminating (in the parts of the sky now called the Gauquelin sectors). Could the Mars effect be found in a fresh sample of American champions that the KZA researchers would gather for themselves?

The KZA test was famously criticized in the long article “sTARBABY,” published in Fate magazine by KZA participant and astronomer Dennis Rawlins (1981). In the article, Rawlins reports on all manner of infighting and bad behavior by the KZA scientists, but he never actually says what the fighting was about. Consequently, readers of the sTARBABY article are left with the impression that no scientific impropriety occurred during the course of the study. Despite all the gory details and fault finding, it seems that Rawlins simply could not bring himself to expose the basis of it all.

Kurtz gathered birth data of famous athletes for the experiment in three canvases, and the Mars effect diminished with each canvas. He gathered the first canvas in 1977 (N=128), the next canvas in spring/summer 1978 (N=198), and the last canvas in fall 1978 (N=82). Chance expectancy in the Gauquelin sectors is 17%, and this was to be compared with the results for famous athletes. The first canvas, which had the highest proportion of top athletes (60.9% of whom had at least one citation in the sports dictionaries used), produced a frequency of 19.5%, which was above chance expectancy. The second canvas (47.0% of whom had at least one citation) produced 12.1%. The third canvas (only 25.6% of whom had at least one citation) produced a mere 7.3%. The use of sequential canvases is unusual and raises suspicions of sampling bias, and the KZA researchers did not adequately explain why the data were gathered in separate canvases.
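The arithmetic behind these figures can be checked with a short sketch. It assumes the number of athletes with Mars in the Gauquelin sectors in each canvas can be recovered by rounding the reported percentage of N to the nearest whole athlete; these implied counts are a reconstruction, not figures taken from the KZA report, and the function names are illustrative only.

```python
# Canvas labels, sample sizes, and Mars-sector percentages as reported above.
CANVASES = [
    ("1977", 128, 19.5),
    ("spring/summer 1978", 198, 12.1),
    ("fall 1978", 82, 7.3),
]
CHANCE = 0.17  # chance expectancy in the Gauquelin sectors, as stated above


def implied_hits(n, percent):
    """ASSUMPTION: recover a whole-number count by rounding percent of n."""
    return round(n * percent / 100)


def pooled_frequency(canvases):
    """Return (implied hits, athletes, frequency) for all canvases combined."""
    total_n = sum(n for _, n, _ in canvases)
    total_hits = sum(implied_hits(n, pct) for _, n, pct in canvases)
    return total_hits, total_n, total_hits / total_n


if __name__ == "__main__":
    for label, n, pct in CANVASES:
        hits = implied_hits(n, pct)
        print(f"{label}: {hits} of {n} (chance would give about {n * CHANCE:.1f})")
    hits, n, freq = pooled_frequency(CANVASES)
    print(f"pooled: {hits} of {n} = {freq:.1%}")
```

Under this rounding assumption, the pooled frequency is simply the size-weighted average of the three canvas frequencies, which is why the two larger, weaker canvases dominate the overall result.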

Kurtz knew from Gauquelin that team athletes, especially basketball players, did not yield a strong Mars effect, and the KZA test included an increasing proportion of basketball players with each canvas (from 22% to 40% to 71%). In the end, nearly a third of the athletes in the sample were basketball players (N=129). The KZA results went from an above-chance Mars effect in the first canvas (19.5%) to a significant anti-Mars effect (13.5%) in the evaluation of the three canvases overall. Based on their finding of a statistically significant reverse Mars effect, which is quite shocking in itself, the KZA authors claimed the Mars effect to be false.
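For readers who wish to gauge how far the overall 13.5% falls below the 17% chance expectancy, the following is a minimal sketch using the normal approximation to the binomial distribution. It assumes 55 Mars-sector births among 408 athletes, counts implied by 13.5% of N = 128 + 198 + 82 rather than quoted from the KZA report, and it is not necessarily the test the KZA authors themselves applied.

```python
import math

CHANCE = 0.17    # chance expectancy in the Gauquelin sectors, from the text
TOTAL_N = 408    # 128 + 198 + 82 athletes across the three canvases
TOTAL_HITS = 55  # ASSUMPTION: implied by the reported 13.5% of 408


def z_score(hits, n, p0):
    """Standardized deviation of the observed count from chance expectancy."""
    expected = n * p0
    sd = math.sqrt(n * p0 * (1 - p0))
    return (hits - expected) / sd


def lower_tail_p(z):
    """Probability that a standard normal variable is at or below z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


if __name__ == "__main__":
    z = z_score(TOTAL_HITS, TOTAL_N, CHANCE)
    print(f"z = {z:.2f}, one-sided p = {lower_tail_p(z):.3f}")
```

The function returns a one-sided (deficit) probability; doubling it gives the two-sided value, so the choice of test affects how strongly the deficit registers.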

Now that the KZA method of testing has demonstrated how to successfully reverse the Mars effect into significantly negative territory, the challenge for science ought to be to understand what it is about the character and skills of basketball players that sets them apart from other athletes. Perhaps, as astrologer Zip Dobyns (1981) has suggested, physical height, more than athletic skill, might have been the primary variable for success in basketball. Dobyns’ argument makes sense. Physical height was by far the definitive advantage in basketball during the sports era tested, before the game-changing development of the spectacularly athletic flying jump shots that everyone is now accustomed to. Height could have gone unnoticed as a selection bias, and whether the KZA experimenters understood this or not, they recognized that basketball tilted the playing field and they took full advantage of it.

Almost all the athletes in the KZA study were born before 1950, so a fresh sample can be gathered. Future Mars athlete tests ought to incorporate the improvements in experimental sensitivity implemented by psychologist and statistician Suitbert Ertel. As Ertel recommends, Mars frequency should be quantified by Gauquelin's more exacting 36 rather than 12 diurnal sector divisions. Also, to eliminate selection bias, the athletes should be strictly ranked according to the number of their citations in the agreed-upon sports dictionaries, as a measurement of professional eminence, as Ertel has done. Citation count is a high-stakes objectifying variable that is more capable than traditional methods of evaluating whether there is a Mars-related eminence effect. Despite the obvious selection bias, Ertel's reassessment of the KZA data using these improved methods has demonstrated the presence of a significant Mars eminence effect. The frequency of Mars in Gauquelin sectors increases in proportion to the eminence of the athletes (Ertel and Irving, 1996).


Dobyns, Zip (1981). “Starbaby: the Great Debunking Scandal,” Mutable Dilemma, (Virgo). Viewed 2013-07-13.
Ertel, Suitbert (1988). “Raising the Hurdle for the Athletes’ Mars Effect: Association Co-varies with Eminence,” Journal of Scientific Exploration, 2(1), 53-82.
Ertel, Suitbert and Kenneth Irving (1996). The Tenacious Mars Effect. Urania Trust, ISBN 1871989159.
Gauquelin, Michel (1973). Cosmic Influences on Human Behavior. ASI Publishers, New York.
Kurtz, Paul; Marvin Zelen; and George Abell (Winter 1979-1980). “Results of the U.S. Test of the ‘Mars Effect’ Are Negative,” The Skeptical Inquirer 4 (2), 19-26.
Lippard, Jim (2011). “Skeptics and the ‘Mars Effect’: A Chronology of Events and Publications,” (June 5). Viewed 2013-07-13.
Rawlins, Dennis (1981). “sTARBABY,” Fate magazine, No. 34 (October), 67-98.

© 2013 Kenneth McRitchie