Statistics in research design

by Anthony Carpi, Ph.D., Anne E. Egger, Ph.D.

This material is excerpted from a teaching module on the Visionlearning website. To view this material in context, please visit Data: Statistics.

Many people misinterpret statements of likelihood and probability as a sign of weakness or uncertainty in scientific results. However, the use of statistical methods and probability tests in research is an important aspect of science that adds strength and certainty to scientific conclusions. For example, in 1843, John Bennet Lawes, an English entrepreneur, founded the Rothamsted Agriculture Experimental Station in Hertfordshire, England, to investigate the impact of fertilizer application on crop yield. Lawes was motivated to do so because he had established one of the first artificial fertilizer factories a year earlier. For the next 80 years, researchers at the Station conducted experiments in which they applied fertilizers, planted different crops, kept track of the amount of rain that fell, and measured the size of the harvest at the end of each growing season. By the turn of the century, the Station had a vast collection of data but few useful conclusions: one fertilizer would outperform another one year but underperform the next, certain fertilizers appeared to affect only certain crops, and the differing amounts of rainfall that fell each year continually confounded the experiments (Salsburg, 2001). The data were essentially useless because so many variables were uncontrolled.

Figure 2: A building at the Rothamsted Research Station

In 1919, the Rothamsted Station hired a young statistician by the name of Ronald Aylmer Fisher to try to make some sense of the data. Fisher’s statistical analyses suggested that the relationship between rainfall and plant growth was far more statistically significant than the relationship between fertilizer type and plant growth. But the agricultural scientists at the station weren’t out to test for weather – they wanted to know which fertilizers were most effective for which crops. No one could remove weather as a variable in the experiments, but Fisher realized that its effects could essentially be separated out if the experiments were designed appropriately. In order to share his insights with the scientific community, he published two books: Statistical Methods for Research Workers in 1925 and The Design of Experiments in 1935. By highlighting the need to consider statistical analysis during the planning stages of research, Fisher revolutionized the practice of science and transformed the Rothamsted Station into a major center for research on statistics and agriculture, which it still is today.

In The Design of Experiments, Fisher introduced several concepts that have become hallmarks of good scientific research, including the use of controls, randomization, and replication (Figure 3).

Controls: The use of controls is based on the concept of variability. Since any phenomenon has some measure of variability, controls allow the researcher to measure natural, random, or systematic variability in a similar system and use that estimate as a baseline for comparison to the observed variable or phenomenon. At Rothamsted, a control would be a crop that did not receive the application of fertilizer (see plots labeled I in Figure 3). The variability inherent in plant growth would still produce plants of varying heights and sizes. The control then could provide a measure of the impact that weather or other variables could have on crop growth independent of fertilizer application, thus allowing the researchers to statistically remove this as a factor.
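The role of a control as a baseline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the yield numbers are hypothetical values invented for the example (they are not Rothamsted data), and the function names `baseline` and `effect_over_control` are made up here. The sketch measures the natural variability in the unfertilized plots and then expresses the fertilizer effect relative to that baseline.

```python
from statistics import mean, stdev

# Hypothetical yields (arbitrary units), invented for illustration only.
control_yields = [20.0, 22.0, 18.0, 21.0, 19.0]  # plots with no fertilizer
treated_yields = [25.0, 27.0, 24.0, 26.0, 23.0]  # plots given one fertilizer


def baseline(yields):
    """Return the mean and sample standard deviation of a set of plot yields."""
    return mean(yields), stdev(yields)


def effect_over_control(treated, control):
    """Estimate the treatment effect as the difference in mean yield."""
    return mean(treated) - mean(control)


control_mean, control_sd = baseline(control_yields)
effect = effect_over_control(treated_yields, control_yields)

print(f"control baseline: {control_mean:.1f} +/- {control_sd:.2f}")
print(f"estimated fertilizer effect: {effect:.1f}")
```

Because the control plots experience the same weather as the treated plots, the spread in the control yields gives a rough sense of how large a difference must be before it can be attributed to the fertilizer rather than to ordinary variability.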

Randomization: Statistical randomization helps to manage bias in scientific research. Unlike the everyday sense of the word "random," which implies something haphazard or disorganized, statistical randomization is a precise procedure in which the units being observed are assigned to a treatment or control group in a manner that takes into account the potential influence of confounding variables. This allows the researcher to quantify the influence of these confounding variables by observing them in both the control and treatment groups. For example, before Fisher, fertilizers were applied along different crop rows at Rothamsted, some of which fell entirely along the edge of fields. Yet edges are known to affect agricultural yield, and so it was difficult in many cases to distinguish edge effects from fertilizer effects. Fisher introduced a process of randomly assigning different fertilizers to different plots within a field in a single year while ensuring that not all of the treatment (or control) plots for any particular fertilizer fell along the edge of the field (see Figure 3).
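The idea of random assignment with an edge constraint can be illustrated with a short Python sketch. It is not Fisher's actual procedure: it assumes a hypothetical 4 x 5 field, five treatments with four plots each, and a simple "shuffle and re-draw" rule that rejects any layout in which some treatment has every one of its plots on the border. The function names, grid size, and seed are all assumptions made for the example.

```python
import random


def is_edge(row, col, n_rows, n_cols):
    """True if the plot at (row, col) lies on the border of the field."""
    return row in (0, n_rows - 1) or col in (0, n_cols - 1)


def randomize_layout(treatments, replicates, n_rows, n_cols,
                     seed=None, max_tries=10_000):
    """Shuffle treatments over a grid, rejecting any layout in which a
    treatment has all of its plots on the field edge."""
    if len(treatments) * replicates != n_rows * n_cols:
        raise ValueError("treatments x replicates must fill the grid exactly")
    rng = random.Random(seed)
    labels = [t for t in treatments for _ in range(replicates)]
    for _ in range(max_tries):
        rng.shuffle(labels)
        # Cut the shuffled list into rows of the field.
        grid = [labels[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
        # Collect the treatments that appear on at least one interior plot.
        interior = {grid[r][c]
                    for r in range(n_rows) for c in range(n_cols)
                    if not is_edge(r, c, n_rows, n_cols)}
        if interior == set(treatments):
            return grid
    raise RuntimeError("no acceptable layout found")


# Hypothetical labels: an unfertilized control plus four fertilizers.
TREATMENTS = ["control", "s", "m", "c", "u"]
layout = randomize_layout(TREATMENTS, replicates=4, n_rows=4, n_cols=5, seed=1927)
for row in layout:
    print(" ".join(f"{t:>7}" for t in row))
```

Fixing the seed makes the layout reproducible, which is convenient for record keeping; leaving `seed=None` produces a fresh random layout on each run.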

Figure 3: An original figure from Fisher's The Design of Experiments showing the arrangement of treatment groups and yields of barley in an experiment at the Rothamsted station in 1927 (Fisher, 1935). Letters in parentheses denote control plots not treated with fertilizer (I) or those treated with different fertilizers (s = sulfate of ammonia, m = chloride of ammonia, c = cyanamide, and u = urea) with or without the addition of superphosphate (p). Subscripted numbers in parentheses indicate relative quantities of fertilizer used. Numbers at the bottom of each block indicate the relative yield of barley from the plot.

Replication: Fisher also advocated for replicating experimental trials and measurements such that the range of variability inherently associated with the experiment or measurement could be quantified and the robustness of the results could be evaluated. At Rothamsted this meant planting multiple plots with the same crop and applying the same fertilizer to each of those plots (see Figure 3). Further, this meant repeating similar applications in different years so that the variability of different fertilizer applications as a function of different weather conditions could be quantified.

In general, scientists design research studies based on the nature of the question they are seeking to investigate, but they refine their research plan in line with many of Fisher’s statistical concepts to increase the likelihood that their findings will be useful. The incorporation of these techniques facilitates the analysis and interpretation of data, another place where statistics are used.
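To make the value of replication concrete, the following Python sketch summarizes several replicate plots per treatment. The yields are hypothetical numbers invented for the example, and the `summarize` helper is an assumption, not part of any published analysis. For each treatment it reports the mean, the sample standard deviation, and the standard error of the mean, which is one common way to quantify the variability that replication exposes.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical yields, invented for illustration: {treatment: replicate plot yields}
yields = {
    "control": [20.0, 22.0, 18.0, 20.0],
    "urea": [25.0, 27.0, 23.0, 25.0],
}


def summarize(replicates):
    """Summarize replicate measurements: count, mean, sample standard
    deviation, and standard error of the mean."""
    n = len(replicates)
    sd = stdev(replicates)
    return {"n": n, "mean": mean(replicates), "sd": sd, "sem": sd / sqrt(n)}


summary = {name: summarize(values) for name, values in yields.items()}

for name, stats in summary.items():
    print(f"{name:>8}: mean={stats['mean']:.1f}  "
          f"sd={stats['sd']:.2f}  sem={stats['sem']:.2f}  (n={stats['n']})")
```

With only one plot per treatment, neither the standard deviation nor the standard error could be computed at all, so there would be no way to judge whether a difference between treatments exceeds plot-to-plot variability. The sketch covers replication within a single season only; repeating it across years, as described above, would add a second source of variability to quantify.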

Anthony Carpi, Ph.D., Anne E. Egger, Ph.D. “Statistics in research design” Visionlearning Vol. HID (8), 2009.