Comparison in Scientific Research: Uncovering statistically significant relationships

by Anthony Carpi, Ph.D., Anne E. Egger, Ph.D.

Did you know?

Did you know that when Europeans first saw chimpanzees, they thought the animals were hairy, adult humans with stunted growth? A study of chimpanzees paved the way for comparison to be recognized as an important research method. Later, Charles Darwin and others used this comparative research method in the development of the theory of evolution.

Key concepts

Comparison is used to determine and quantify relationships between two or more variables by observing different groups that either by choice or circumstance are exposed to different treatments.
Comparison includes both retrospective studies, which look at events that have already occurred, and prospective studies, which examine variables from the present forward.
Comparative research is similar to experimentation in that it involves comparing a treatment group to a control, but it differs in that the treatment is observed rather than consciously imposed, either because of ethical concerns or because imposing it is not possible, as in a retrospective study.

Anyone who has stared at a chimpanzee in a zoo (Figure 1) has probably wondered about the animal's similarity to humans. Chimps make facial expressions that resemble those of humans, use their hands in much the same way we do, are adept at using different objects as tools, and even laugh when they are tickled. It may not be surprising to learn, then, that when the first captured chimpanzees were brought to Europe in the 17th century, people were confused, labeling the animals "pygmies" and speculating that they were stunted versions of "full-grown" humans. A London physician named Edward Tyson obtained a "pygmie" that had died of an infection shortly after arriving in London, and began a systematic study of the animal that cataloged the differences between chimpanzees and humans, thus helping to establish comparative research as a scientific method.

Figure 1: A chimpanzee
image ©Corel Corporation

A brief history of comparative methods

In 1698, Tyson, a member of the Royal Society of London, began a detailed dissection of the "pygmie" he had obtained and published his findings in the 1699 work: Orang-Outang, sive Homo Sylvestris: or, the Anatomy of a Pygmie Compared with that of a Monkey, an Ape, and a Man. The title of the work further reflects the misconception that existed at the time – Tyson did not use the term Orang-Outang in its modern sense to refer to the orangutan; he used it in its literal translation from the Malay language as "man of the woods," as that is how the chimps were viewed.

Tyson took great care in his dissection. He precisely measured and compared a number of anatomical variables such as brain size of the "pygmie," ape, and human. He recorded his measurements of the "pygmie," even down to the direction in which the animal's hair grew: "The tendency of the Hair of all of the Body was downwards; but only from the Wrists to the Elbows 'twas upwards" (Russell, 1967). Aided by William Cowper, Tyson made drawings of various anatomical structures, taking great care to accurately depict the dimensions of these structures so that they could be compared to those in humans (Figure 2). His systematic comparative study of the dimensions of anatomical structures in the chimp, ape, and human led him to state:

in the Organization of abundance of its Parts, it more approached to the Structure of the same in Men: But where it differs from a Man, there it resembles plainly the Common Ape, more than any other Animal. (Russell, 1967)

Tyson's comparative studies proved exceptionally accurate and his research was used by others, including Thomas Henry Huxley in Evidence as to Man's Place in Nature (1863) and Charles Darwin in The Descent of Man (1871).

Figure 2: Edward Tyson's drawing of the external appearance of a "pygmie" (left) and the animal's skeleton (right) from The Anatomy of a Pygmie Compared with that of a Monkey, an Ape, and a Man, second edition, London, printed for T. Osborne, 1751.

Tyson's methodical and scientific approach to anatomical dissection contributed to the development of evolutionary theory and helped establish the field of comparative anatomy. Further, Tyson's work helps to highlight the importance of comparison as a scientific research method.

Comparison as a scientific research method

Comparative research represents one approach in the spectrum of scientific research methods and in some ways is a hybrid of other methods, drawing on aspects of both experimental science (see our Experimentation in Science module) and descriptive research (see our Description in Science module). Similar to experimentation, comparison seeks to decipher the relationship between two or more variables by documenting observed differences and similarities between two or more subjects or groups. In contrast to experimentation, the comparative researcher does not subject one of those groups to a treatment, but rather observes a group that either by choice or circumstance has been subject to a treatment. Thus comparison involves observation in a more "natural" setting, not subject to experimental confines, and in this way evokes similarities with description.

Importantly, the simple comparison of two variables or objects is not comparative research. Tyson's work would not have been considered scientific research if he had simply noted that "pygmies" looked like humans without measuring bone lengths and hair growth patterns. Instead, comparative research involves the systematic cataloging of the nature and/or behavior of two or more variables, and the quantification of the relationship between them.

Figure 3: Skeleton of the juvenile chimpanzee dissected by Edward Tyson, currently displayed at the Natural History Museum, London.
image ©Peter Kaminski

While the choice of which research method to use is a personal decision based in part on the training of the researchers conducting the study, there are a number of scenarios in which comparative research would likely be the primary choice.

The first scenario is one in which the scientist is not trying to measure a response to change, but rather to understand the similarities and differences between two subjects. For example, Tyson was not observing a change in his "pygmie" in response to an experimental treatment. Instead, his research was a comparison of the unknown "pygmie" to humans and apes in order to determine the relationship between them.
A second scenario in which comparative studies are common is when the physical scale or timeline of a question may prevent experimentation. For example, in the field of paleoclimatology, researchers have compared cores taken from sediments deposited millions of years ago in the world's oceans to see if the sedimentary composition is similar across all oceans or differs according to geographic location. Because the sediments in these cores were deposited millions of years ago, it would be impossible to obtain these results through the experimental method. Research designed to look at past events such as sediment cores deposited millions of years ago is referred to as retrospective research.
A third common comparative scenario is when the ethical implications of an experimental treatment preclude an experimental design. Researchers who study the toxicity of environmental pollutants or the spread of disease in humans are precluded from purposefully exposing a group of individuals to the toxin or disease for ethical reasons. In these situations, researchers would set up a comparative study by identifying individuals who have been accidentally exposed to the pollutant or disease and comparing their symptoms to those of a control group of people who were not exposed. Research designed to look at events from the present into the future, such as a study looking at the development of symptoms in individuals exposed to a pollutant, is referred to as prospective research.

Comparative science was significantly strengthened in the late 19th and early 20th century with the introduction of modern statistical methods. These were used to quantify the association between variables (see our Statistics in Science module). Today, statistical methods are critical for quantifying the nature of relationships examined in many comparative studies. The outcome of comparative research is often presented in one of the following ways: as a probability, as a statement of statistical significance, or as a declaration of risk. For example, in 2007 Kristensen and Bjerkedal showed that there is a statistically significant relationship (at the 95% confidence level) between birth order and IQ by comparing test scores of first-born children to those of their younger siblings (Kristensen & Bjerkedal, 2007). And numerous studies have contributed to the determination that the risk of developing lung cancer is 30 times greater in smokers than in nonsmokers (NCI, 1997).
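To make the idea of a "declaration of risk" concrete, the following minimal Python sketch computes a relative risk, the ratio of the rate of an outcome in an exposed group to the rate in an unexposed group. The function name and all of the counts are invented for illustration; they are not data from any of the studies cited here.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Ratio of the outcome rate among the exposed to the rate among the unexposed."""
    exposed_rate = exposed_cases / exposed_total
    unexposed_rate = unexposed_cases / unexposed_total
    return exposed_rate / unexposed_rate

# Hypothetical cohort (invented numbers, not from any cited study):
# 30 of 1,000 exposed people develop the disease, versus 1 of 1,000 unexposed.
rr = relative_risk(30, 1000, 1, 1000)
print(f"Relative risk: {rr:.1f}")
```

A relative risk near 1 would indicate no difference between the groups; a statement such as "30 times greater" corresponds to a relative risk of about 30.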

Comparison in practice: The case of cigarettes

In 1919, Dr. George Dock, chairman of the Department of Medicine at Barnes Hospital in St. Louis, asked all of the third- and fourth-year medical students at the teaching hospital to observe an autopsy of a man with a disease so rare, he claimed, that most of the students would likely never see another case of it in their careers. With the medical students gathered around, the physicians conducting the autopsy observed that the patient's lungs were speckled with large dark masses of cells that had caused extensive damage to the lung tissue and had forced the airways to close and collapse. Dr. Alton Ochsner, one of the students who observed the autopsy, would write years later that "I did not see another case until 1936, seventeen years later, when in a period of six months, I saw nine patients with cancer of the lung. – All the afflicted patients were men who smoked heavily and had smoked since World War I" (Meyer, 1992).

Figure 4: Image from a stereoptic card showing a woman smoking a cigarette circa 1900

The American physician Dr. Isaac Adler was, in fact, the first scientist to propose a link between cigarette smoking and lung cancer in 1912, based on his observation that lung cancer patients often reported that they were smokers. Adler's observations, however, were anecdotal, and provided no scientific evidence toward demonstrating a relationship. The German epidemiologist Franz Müller is credited with the first case-control study of smoking and lung cancer in the 1930s. Müller sent a survey to the relatives of individuals who had died of cancer, and asked them about the smoking habits of the deceased. Based on the responses he received, Müller reported a higher incidence of lung cancer among heavy smokers compared to light smokers. However, the study had a number of problems. First, it relied on the memory of relatives of deceased individuals rather than first-hand observations, and second, no statistical association was made. Soon after this, the tobacco industry began to sponsor research with the biased goal of repudiating negative health claims against cigarettes (see our Scientific Institutions and Societies module for more information on sponsored research).

Beginning in the 1950s, several well-controlled comparative studies were initiated. In 1950, Ernest Wynder and Evarts Graham published a retrospective study comparing the smoking habits of 605 hospital patients with lung cancer to 780 hospital patients with other diseases (Wynder & Graham, 1950). Their study showed that 1.3% of lung cancer patients were nonsmokers while 14.6% of patients with other diseases were nonsmokers. In addition, 51.2% of lung cancer patients were "excessive" smokers while only 19.1% of other patients were excessive smokers. Both of these comparisons proved to be statistically significant differences. The statisticians who analyzed the data concluded:

when the nonsmokers and the total of the high smoking classes of patients with lung cancer are compared with patients who have other diseases, we can reject the null hypothesis that smoking has no effect on the induction of cancer of the lungs.
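The kind of comparison described above can be sketched in Python as a test on a 2x2 table. This is only an illustration: the counts are reconstructed by rounding the reported percentages (1.3% of 605 and 14.6% of 780), so they are an assumption rather than the published raw data, and the Pearson chi-square test used here is a stand-in that may differ from the procedure the original statisticians applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]], which has 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Assumption: counts reconstructed from the rounded percentages in the text.
cases_total, controls_total = 605, 780
cases_nonsmokers = round(0.013 * cases_total)        # lung cancer patients
controls_nonsmokers = round(0.146 * controls_total)  # patients with other diseases
cases_smokers = cases_total - cases_nonsmokers
controls_smokers = controls_total - controls_nonsmokers

stat, p_value = chi_square_2x2(
    cases_nonsmokers, cases_smokers, controls_nonsmokers, controls_smokers
)

# Odds of being a smoker among cases relative to the odds among controls.
odds_ratio = (cases_smokers * controls_nonsmokers) / (cases_nonsmokers * controls_smokers)

print(f"chi-square = {stat:.1f}, p = {p_value:.2g}, odds ratio = {odds_ratio:.1f}")
```

With these reconstructed counts the statistic falls far above 3.84, the critical value for 1 degree of freedom at the 5% level, which is consistent with the rejection of the null hypothesis quoted above.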

Wynder and Graham also suggested that there might be a lag of ten years or more between the period of smoking in an individual and the onset of clinical symptoms of cancer. This would present a major challenge to researchers since any study that investigated the relationship between smoking and lung cancer in a prospective fashion would have to last many years.

Richard Doll and Austin Hill published a similar comparative study in 1950 in which they showed that there was a statistically higher incidence of smoking among lung cancer patients compared to patients with other diseases (Doll & Hill, 1950). In their discussion, Doll and Hill raise an interesting point regarding comparative research methods by saying,

This is not necessarily to state that smoking causes carcinoma of the lung. The association would occur if carcinoma of the lung caused people to smoke or if both attributes were end-effects of a common cause.

They go on to assert that because the habit of smoking was seen to develop before the onset of lung cancer, the argument that lung cancer leads to smoking can be rejected. They therefore conclude, "that smoking is a factor, and an important factor, in the production of carcinoma of the lung."

Despite this substantial evidence, both the tobacco industry and unbiased scientists raised objections, claiming that the retrospective research on smoking was "limited, inconclusive, and controversial." The industry stated that the studies published did not demonstrate cause and effect, but rather a spurious association between two variables. Dr. Wilhelm Hueper of the National Cancer Institute, a scientist with a long history of research into occupational causes of cancers, argued that the emphasis on cigarettes as the only cause of lung cancer would compromise research support for other causes of lung cancer. Ronald Fisher, a renowned statistician, also was opposed to the conclusions of Doll and others, purportedly because they promoted a "puritanical" view of smoking.

The tobacco industry mounted an extensive campaign of misinformation, sponsoring and then citing research that showed that smoking did not cause "cardiac pain" as a distraction from the studies that were being published regarding cigarettes and lung cancer. The industry also highlighted studies that showed that individuals who quit smoking suffered from mild depression, and they pointed to the fact that even some doctors themselves smoked cigarettes as evidence that cigarettes were not harmful (Figure 5).

Figure 5: Cigarette advertisement circa 1946.

While the scientific research began to impact health officials and some legislators, the industry's ad campaign was effective. The US Federal Trade Commission banned tobacco companies from making health claims about their products in 1955. However, more significant regulation was averted. An editorial that appeared in the New York Times in 1963 summed up the national sentiment when it stated that the tobacco industry made a "valid point," and the public should refrain from making a decision regarding cigarettes until further reports were issued by the US Surgeon General.

In 1951, Doll and Hill enrolled 40,000 British physicians in a prospective comparative study to examine the association between smoking and the development of lung cancer. In contrast to the retrospective studies that followed patients with lung cancer back in time, the prospective study was designed to follow the group forward in time. In 1952, Drs. E. Cuyler Hammond and Daniel Horn enrolled 187,783 white males in the United States in a similar prospective study. And in 1959, the American Cancer Society (ACS) began the first of two large-scale prospective studies of the association between smoking and the development of lung cancer. The first ACS study, named Cancer Prevention Study I, enrolled more than 1 million individuals and tracked their health, smoking and other lifestyle habits, development of diseases, cause of death, and life expectancy for almost 13 years (Garfinkel, 1985).

All of the studies demonstrated that smokers are at a higher risk of developing and dying from lung cancer than nonsmokers. The ACS study further showed that smokers have elevated rates of other pulmonary diseases, coronary artery disease, stroke, and cardiovascular problems. The two ACS Cancer Prevention Studies would eventually show that 52% of deaths among smokers enrolled in the studies were attributed to cigarettes.

In the second half of the 20th century, other scientific research methods would contribute additional lines of evidence to the conclusion that cigarette smoke is a major cause of lung cancer:

Descriptive studies of the pathology of lungs of deceased smokers would demonstrate that smoking causes significant physiological damage to the lungs.

Experiments that exposed mice, rats, and other laboratory animals to cigarette smoke showed that it caused cancer in these animals (see our Experimentation in Science module for more information).

Physiological models would help demonstrate the mechanism by which cigarette smoke causes cancer.

As evidence linking cigarette smoke to lung cancer and other diseases accumulated, the public, the legal community, and regulators slowly responded. In 1957, the US Surgeon General first acknowledged an association between smoking and lung cancer when a report was issued stating, "It is clear that there is an increasing and consistent body of evidence that excessive cigarette smoking is one of the causative factors in lung cancer." In 1965, over objections by the tobacco industry and the American Medical Association, which had just accepted a $10 million grant from the tobacco companies, the US Congress passed the Federal Cigarette Labeling and Advertising Act, which required that cigarette packs carry the warning: "Caution: Cigarette Smoking May Be Hazardous to Your Health." In 1967, the US Surgeon General issued a second report stating that cigarette smoking is the principal cause of lung cancer in the United States. While the tobacco companies found legal means to protect themselves for decades following this, in 1996, Brown and Williamson Tobacco Company was ordered to pay $750,000 in a tobacco liability lawsuit; it became the first liability award paid to an individual by a tobacco company.

Comparison across disciplines

Comparative studies are used in a host of scientific disciplines, from anthropology to archaeology, comparative biology, epidemiology, psychology, and even forensic science. DNA fingerprinting, a technique used to incriminate or exonerate a suspect using biological evidence, is based on comparative science. In DNA fingerprinting, segments of DNA are isolated from a suspect and from biological evidence such as blood, semen, or other tissue left at a crime scene. Up to 20 different segments of DNA from the suspect are compared with the corresponding segments of the DNA found at the crime scene. If all of the segments match, the investigator can calculate the statistical probability that the DNA came from the suspect as opposed to someone else. Thus DNA matches are described in terms of a "1 in 1 million" or "1 in 1 billion" chance of error.
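One common way to arrive at a figure such as "1 in 1 million" is the product rule: the population frequency of the matching pattern at each compared segment is multiplied across segments to estimate how often an unrelated person would match at all of them by chance. The Python sketch below illustrates the arithmetic only. The function name and the frequencies are hypothetical, and the calculation assumes the segments are statistically independent.

```python
import math

def random_match_probability(segment_frequencies):
    """Product rule: multiply the population frequency of the matching
    pattern at each segment, assuming the segments are independent."""
    return math.prod(segment_frequencies)

# Hypothetical frequencies (invented for illustration): at each of six
# compared segments, 1 person in 10 carries the matching pattern.
hypothetical_frequencies = [0.1] * 6

p = random_match_probability(hypothetical_frequencies)
print(f"Chance of a coincidental match: about 1 in {round(1 / p):,}")
```

Because the frequencies multiply, each additional matching segment shrinks the chance of a coincidental match, which is why comparing many segments yields such small probabilities.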

Comparative methods are also commonly used in studies involving humans due to the ethical limits of experimental treatment. For example, in 2007, Petter Kristensen and Tor Bjerkedal published a study in which they compared the IQ of over 250,000 male Norwegians in the military (Kristensen & Bjerkedal, 2007). The researchers found a significant relationship between birth order and IQ, where the average IQ of first-born male children was approximately three points higher than the average IQ of the second-born male in the same family. The researchers further showed that this relationship was correlated with social rather than biological factors, as second-born males who grew up in families in which the first-born child died had average IQs similar to other first-born children. One might imagine a scenario in which this type of study could be carried out experimentally, for example, purposefully removing first-born male children from certain families, but the ethics of such an experiment preclude it from ever being conducted.

Limitations of comparative methods

One of the primary limitations of comparative methods is the control of other variables that might influence a study. For example, as pointed out by Doll and Hill in 1950, the association between smoking and cancer deaths could have meant that: a) smoking caused lung cancer, b) lung cancer caused individuals to take up smoking, or c) a third unknown variable caused lung cancer AND caused individuals to smoke (Doll & Hill, 1950). As a result, comparative researchers often go to great lengths to choose two different study groups that are similar in almost all respects except for the treatment in question. In fact, many comparative studies in humans are carried out on identical twins for this exact reason. For example, in the field of tobacco research, dozens of comparative twin studies have been used to examine everything from the health effects of cigarette smoke to the genetic basis of addiction.
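Possibility (c), a third variable that drives both the exposure and the outcome, can be illustrated with a small Python sketch. All counts below are invented for illustration. They are constructed so that within each level of a hypothetical third factor the exposed and unexposed groups have identical risk, yet pooling the groups produces an apparent association.

```python
# Each entry is (cases, group size). All numbers are hypothetical.
strata = {
    "factor present": {"exposed": (160, 800), "unexposed": (40, 200)},
    "factor absent": {"exposed": (10, 200), "unexposed": (40, 800)},
}

def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    (e_cases, e_total), (u_cases, u_total) = exposed, unexposed
    return (e_cases / e_total) / (u_cases / u_total)

def pooled(group):
    """Add up cases and group sizes across both levels of the third factor."""
    cases = sum(stratum[group][0] for stratum in strata.values())
    total = sum(stratum[group][1] for stratum in strata.values())
    return cases, total

# Comparison within each level of the third factor.
within_stratum = {
    name: risk_ratio(stratum["exposed"], stratum["unexposed"])
    for name, stratum in strata.items()
}

# Comparison that ignores the third factor.
pooled_ratio = risk_ratio(pooled("exposed"), pooled("unexposed"))

for name, ratio in within_stratum.items():
    print(f"{name}: risk ratio = {ratio:.2f}")
print(f"pooled: risk ratio = {pooled_ratio:.2f}")
```

In this constructed example the pooled comparison suggests that exposure roughly doubles the risk, even though exposure makes no difference once the third factor is held fixed. Choosing study groups that are alike in other respects, as in twin studies, is one way researchers try to guard against this.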

Comparison in modern practice

Figure 6: The "Keeling curve," a long-term record of atmospheric CO₂ concentration measured at the Mauna Loa Observatory (Keeling et al.). Although the annual oscillations represent natural, seasonal variations, the long-term increase means that concentrations are higher than they have been in 400,000 years. Graphic courtesy of NASA's Earth Observatory.

Despite the lessons learned during the debate that ensued over the possible effects of cigarette smoke, misconceptions still surround comparative science. For example, in the late 1950s, Charles Keeling, an oceanographer at the Scripps Institution of Oceanography, began to publish data he had gathered from a long-term descriptive study of atmospheric carbon dioxide (CO₂) levels at the Mauna Loa Observatory in Hawaii (Keeling, 1958). Keeling observed that atmospheric CO₂ levels were increasing at a rapid rate (Figure 6). He and other researchers began to suspect that rising CO₂ levels were associated with increasing global mean temperatures, and several comparative studies have since correlated rising CO₂ levels with rising global temperature (Keeling, 1970). Together with research from modeling studies (see our Modeling in Scientific Research module), this research has provided evidence for an association between global climate change and the burning of fossil fuels (which emits CO₂).

Yet in a move reminiscent of the fight launched by the tobacco companies, the oil and fossil fuel industry launched a major public relations campaign against climate change research. As late as 1989, scientists funded by the oil industry were producing reports that called the research on climate change "noisy junk science" (Roberts, 1989). As with the tobacco issue, challenges to early comparative studies tried to paint the method as less reliable than experimental methods. But the challenges actually strengthened the science by prompting more researchers to launch investigations, thus providing multiple lines of evidence supporting an association between atmospheric CO₂ concentrations and climate change. As a result, the accumulated evidence prompted the Intergovernmental Panel on Climate Change, organized by the United Nations, to issue a report stating that "Warming of the climate system is unequivocal," and "Carbon dioxide is the most important anthropogenic greenhouse gas" (IPCC, 2007).

Comparative studies are a critical part of the spectrum of research methods currently used in science. They allow scientists to apply a treatment-control design in settings that preclude experimentation, and they can provide invaluable information about the relationships between variables. The intense scrutiny that comparison has undergone in the public arena due to cases involving cigarettes and climate change has actually strengthened the method by clarifying its role in science and emphasizing the reliability of data obtained from these studies.
