Visualizing Scientific Data: An essential component of research

Visionlearning has updated this module since its original publication. For more comprehensive information on visual aids and data, please go to Data: Using Graphs and Visual Data.

Since the mid-1600s, when Isaac Newton first made it a habit to take precise measurements of the phenomena he studied, data have been a fundamental component of any scientific endeavor. Scientists in different fields collect data in many different forms, from the magnitude and location of earthquakes to the length of finch beaks to the concentration of carbon dioxide in the atmosphere and so on. Although data are initially compiled in tables or databases, they are inevitably displayed in a graphic form to help scientists visualize and interpret the variation within the data. This graphic form may be a graph or pie chart, a map or cross-section, or an animation.

Figure 1: Excerpt of a data table of atmospheric CO2 measured at Mauna Loa. Data were obtained from the Carbon Dioxide Information Analysis Center.

Tables of raw numbers can be difficult to interpret. Imagine trying to identify any long-term trends in this data table of atmospheric carbon dioxide concentrations taken over several years at Mauna Loa.

It is difficult for most people to make sense of that much numerical information. If, however, we take the exact same data and plot them on a graph, here's what they look like:

Figure 2: Atmospheric CO2 measured at Mauna Loa. This is a famous graph called the Keeling Curve (courtesy NASA).

The x-axis, or the horizontal axis, shows the variable of time in units of years, and the y-axis, or the vertical axis, shows the range of the variable of carbon dioxide (CO2) concentration in units of parts per million (ppm). Thus, the graph is showing us the change in atmospheric CO2 concentrations over time. The dark blue line shows average annual CO2 concentrations as listed in the right-hand column of the table. The light blue line represents all of the monthly numerical data from the table above. While a keen observer may have been able to pick out of the table the increasing average annual CO2 concentrations seen in the dark blue trend line, it would have been very difficult for even the most highly trained scientist to note the yearly cycling in atmospheric CO2 that the light blue trend line easily demonstrates.

Graphing data is just a first step. The more important role of a graph is to help scientists interpret their data: what do all of the numbers really mean? On the graph, it is easy to see that the concentration of atmospheric CO2 rose steadily over time, from a low of about 315 ppm in 1958 to about 373 ppm at the time of writing. Within that long-term trend, there are annual cycles of about 5 ppm. The next step in interpretation is to explain why there is a long-term rise in atmospheric CO2 concentrations on top of an annual fluctuation. To do that, we need to move beyond the graph itself and put it into context. In this case, the annual cycles of about 5 ppm are related to natural, seasonal changes. Most scientists agree, however, that the long-term increase is related to the growing number of human activities that release CO2, such as burning fossil fuels (see our Carbon Cycle module for more information on this topic).
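The two patterns described above, a long-term rise and an annual cycle, can also be pulled out of a table numerically. The following is a minimal Python sketch, not the method used to produce the Keeling Curve. The monthly values are synthetic (a straight-line rise plus a sine-wave cycle), and the function names, the starting value, the rate of rise, and the cycle size are all assumptions chosen only for illustration.

```python
import math
from statistics import mean


def synthetic_monthly_series(start_year, n_years, base=315.0,
                             rise_per_year=1.0, amplitude=2.5):
    """Build made-up monthly values: a steady rise plus a yearly cycle.

    All defaults are illustrative assumptions, not measurements.
    """
    series = {}
    for i in range(n_years):
        values = []
        for month in range(12):
            trend = base + rise_per_year * (i + month / 12)
            cycle = amplitude * math.sin(2 * math.pi * month / 12)
            values.append(trend + cycle)
        series[start_year + i] = values
    return series


def annual_means(series):
    """Average the twelve monthly values of each year (the long-term trend)."""
    return {year: mean(values) for year, values in series.items()}


def within_year_swing(series):
    """Highest minus lowest monthly value in each year (the annual cycle)."""
    return {year: max(values) - min(values) for year, values in series.items()}


# Synthetic example data, not Mauna Loa measurements.
series = synthetic_monthly_series(1958, 5)
means = annual_means(series)
swings = within_year_swing(series)
for year in sorted(series):
    print(f"{year}: annual mean {means[year]:.2f}, swing {swings[year]:.2f}")
```

The annual means play the role of the dark blue line, and the within-year swing summarizes the cycle that the light blue line makes visible. Note that the swing computed this way still contains a small part of the long-term rise, because the trend continues to climb during the year.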

We just followed a short procedure to extract a lot of information from this graph. Although an infinite variety of data can appear in graphical form, this same procedure can apply when reading any kind of graph:

  1. Describe the graph: What does the title say? What is on the x-axis? What is on the y-axis? What are the units?
  2. Describe the data: What is the numerical range of the data? What kinds of patterns can you see in the data?
  3. Interpret the data: How do the patterns you see in the graph relate to other things you know?

Other ways to visualize data

There are many different types of graphs, and they are used for different purposes. The CO2 graph shown above is a line graph, a very common type of graph often used to show change in one or more variables as it relates to a second variable such as time. Here are a few other examples:

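Figure 3: Bar graphs are used to show a frequency distribution across categories, mainly for comparison. (Histograms look similar, but they group a continuous numerical variable into ranges rather than showing separate categories.) The categories are generally shown on the x-axis and the frequency on the y-axis.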

We can go through the same three-step process to analyze this graph as well. Step 1: The graph shows the ancestry of United States residents. The different ancestries are shown along the x-axis, while the number of people with a given ancestry is shown on the y-axis. Step 2: The most common ancestries are European (English, German, Irish, and Italian), along with American. Step 3: There are several possible interpretations of these data. First, taken together, many more people declare European ancestry than American ancestry, which reveals that the United States is a young country, populated largely by immigrants. Second, the range of ancestries reflects the most common origins of immigrants to this country.
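A bar graph like this one starts from a simple count of how often each category occurs. As a rough sketch of that counting step, the Python below tallies a short list of placeholder responses and prints one text bar per category. The category labels, the sample list, and the function names are invented for illustration and are not the ancestry data shown in Figure 3.

```python
from collections import Counter


def frequency_table(observations):
    """Count each category, most frequent first."""
    return Counter(observations).most_common()


def text_bar_graph(table, width=40):
    """Return one line per category, with bar length proportional to count."""
    if not table:
        return []
    largest = max(count for _, count in table)
    label_width = max(len(str(label)) for label, _ in table)
    lines = []
    for label, count in table:
        bar = "#" * max(1, round(width * count / largest))
        lines.append(f"{str(label):<{label_width}} | {bar} {count}")
    return lines


# Placeholder responses, not real survey data.
responses = ["A", "B", "A", "C", "A", "B"]
table = frequency_table(responses)
for line in text_bar_graph(table, width=6):
    print(line)
```

The categories run down the page here rather than along the x-axis, but the idea is the same: the length of each bar encodes the frequency, so the comparison between categories is visible at a glance.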

Figure 4: Maps are often used to show spatially distributed data. This map shows the distribution of earthquakes around the world (in red), and it is easy to see that they are not distributed randomly.

Although maps are slightly different from graphs, they are still a graphic representation of a large amount of data, and we can use the same technique to analyze them. On this world map, each red dot represents an earthquake. The earthquakes are not randomly distributed: they are clustered in lines and in certain areas. Geologists used this observation to interpret the zones of high earthquake activity as plate boundaries, where two pieces of the earth's thin surface layer meet and rub against each other (see our Plate Tectonics II module for further explanation).

Pie graphs and three-dimensional graphs are yet more ways to visualize data:

Figure 5: A pie graph shows parts of a whole. In this case, the whole is the US adult population, and the parts are the percentages of adults who have completed various levels of education.
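Because a pie graph shows parts of a whole, each slice is simply a category's share of the total. A minimal sketch of that arithmetic in Python follows. The function names and the two-category example are assumptions made for illustration and do not reproduce the education data in Figure 5.

```python
def shares_of_whole(counts):
    """Convert raw counts into percentages of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("the whole must be a positive total")
    return {label: 100.0 * value / total for label, value in counts.items()}


def slice_angles(shares):
    """Convert percentage shares into pie-slice angles in degrees."""
    return {label: 360.0 * share / 100.0 for label, share in shares.items()}


# Placeholder counts, not real survey data.
counts = {"x": 1, "y": 3}
shares = shares_of_whole(counts)
angles = slice_angles(shares)
print(shares)
print(angles)
```

The shares always add up to 100 percent and the angles to 360 degrees, which is why a pie graph suits data that divide one whole into parts and does not suit categories that overlap.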

Figure 6: A three-dimensional plot can connect three variables together. In this case, the x- and y-axes are related to latitude and longitude, while the z-axis shows the concentration of mercury pollution in smaller zones within the area. Figure adapted from Opsomer, J.D., Agras, J., Carpi, A., Rodriques, G. (1995) An Application of Locally Weighted Regression to Airborne Mercury Deposition Around an Incinerator Site, Environmetrics, 6:205-219.

Many areas of study within science have more specialized graphs used for specific kinds of data. Evolutionary biologists, for example, use evolutionary trees to show how species change and evolve over time. Geologists use stereonets to show the orientation of rock layers in three-dimensional space. In all cases, however, the same rules of analysis apply: describe the graph, describe the data and then begin to interpret. The simple procedure described here should help you take full advantage of any graphs you come across within your study of science.

Regardless of the exact type of graph, the creation of clear, understandable visualizations of data is of fundamental importance in all branches of science. Likewise, reading and interpreting graphs is a key skill at all levels, from the introductory student to the research scientist. Graphs are a key component of scientific research papers, where new data are routinely presented. Presenting the data from which conclusions are drawn gives other scientists the opportunity to analyze the data for themselves, a process that helps keep scientific experiments and analysis as objective as possible. Although tables are necessary to record the data, graphs allow readers to visualize complex datasets in a simple, concise manner.

Anne E. Egger, Ph.D. “Visualizing Scientific Data” Visionlearning Vol. SCI-2 (1), 2004.
