Issues in Biology

Is Biology a Statistical Science?

Statistics plays a major role in biology, and it does so in two different ways. First, statistics is used to express relations that describe correlations between a cause and an effect in a system. Good examples are population genetics, health risks, and enzyme kinetics. Causality, the knowledge of what causes what, does not follow from this numerical description, however. To understand cause and effect, we need to study the mechanism underlying the observed properties.

To answer the question of whether biology is a statistical science, we have to recognize that knowledge of mechanisms is a necessity. This brings us to the second aspect of the importance of statistics, namely the question of whether there are laws in biology that are statistical in nature. Science describes the state and behavior of systems. Living systems behave in ways that are thought to differ from those of inanimate matter. Erwin Schroedinger outlined the implication of this difference by referring directly to the statistical nature of the laws of physics. Those laws describe the behavior of a very large number of similar atoms (e.g. the temperature of a gas or a liquid), while biological systems are composed of small numbers of rather dissimilar atoms that often behave in ways that cannot be predicted statistically. Still, biological systems are recognizable structures with non-random patterns and can be described by statistical methods such as kinetics and thermodynamics. At the molecular level, however, biological systems follow stochastic principles, and the randomness of events becomes a major part of their function.

To understand how orderliness is maintained at the molecular level, where statistical behavior does not control function, we need to recognize that biological systems (cells) are never at rest or, in the words of physics and chemistry, never at chemical equilibrium. Physics and chemistry, by contrast, are disciplines with strong foundations in statistics, and they describe systems at equilibrium, and very simple systems at that. Equilibrium is a state of a system that follows statistical laws, with a well-characterized average and variation. Biological systems, in contrast, are complex; their function is based on small numbers of units (atoms, molecules, cells) that are not at equilibrium. There are no biological laws of nature of the kind we find in physics: the first and second laws of thermodynamics, Heisenberg's uncertainty principle, Newton's law of gravity. Of course, biological systems do not violate these laws, but their composition and function defy a simple statistical description.

It may thus come as a surprise that much of what is known in modern biology has been obtained through statistical analysis. Why? Because biological systems have been studied under simplified conditions, a reductionist approach that is epistemologically powerful but has obvious limits in explaining cellular mechanisms. To give an example, biochemical pathways like glycolysis have been determined experimentally by isolating and enriching the components of each individual step (10 steps in glycolysis) and studying the chemical equilibrium in dilute solution with a very large number of molecules of both metabolites and enzymes. So, yes, modern biology is a statistical science. As a statistical science, it relies on large numbers, millions and billions of units in a population or molecular ensemble. The behavior that we can describe is a so-called macroscopic property, the property of a very large number of identical molecules. This route of analysis allows us to explain how a single molecule in our ensemble behaves on average. But how relevant is the average? In a real cell, the actual number of molecules involved in glycolysis is very small; it can be measured in the hundreds, and the molecules operate under crowded conditions. Small changes in the absolute number of molecules can dramatically change the course and output of the pathway, and can stimulate, inhibit or even reverse it.
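The contrast between a large ensemble and a few hundred molecules can be made concrete with a small numerical sketch. It is only an illustration under simple assumptions that are not part of the argument above: each molecule is treated as independently "active" with a fixed probability (the value `P_ACTIVE` and the function names are hypothetical), and the relative fluctuation of the active fraction is compared for a small and a larger population. For independent molecules, this fluctuation is expected to shrink roughly as one over the square root of the number of molecules.

```python
import random
import statistics

def active_fraction_samples(n_molecules, p_active, n_trials, rng):
    """For each trial, return the fraction of n_molecules that is active."""
    samples = []
    for _ in range(n_trials):
        # Each molecule is active independently with probability p_active.
        active = sum(1 for _ in range(n_molecules) if rng.random() < p_active)
        samples.append(active / n_molecules)
    return samples

def relative_fluctuation(samples):
    """Standard deviation divided by the mean (coefficient of variation)."""
    return statistics.pstdev(samples) / statistics.mean(samples)

rng = random.Random(0)  # fixed seed so that a run can be repeated
P_ACTIVE = 0.5          # hypothetical value, chosen only for illustration

for n in (100, 10_000):
    cv = relative_fluctuation(active_fraction_samples(n, P_ACTIVE, 200, rng))
    print(f"N = {n:>6}: relative fluctuation of the active fraction = {cv:.4f}")
```

In such a run the population of 100 molecules should fluctuate noticeably more, relative to its mean, than the population of 10,000, which is the sense in which the average says less about a small system than about a large one.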

Knowing the difference between the average behavior of a population of molecules and the behavior of a specific individual molecule is one of the most important distinctions a scientist has to make when studying the mechanism of a biological system. Here, biology is suddenly no longer the statistical science described above. Here, we deal with small numbers, very small numbers. In fact, we often deal with single molecules, where the world becomes one of random events and small changes can affect the outcome of a reaction. This, of course, is reminiscent of our own experience, where the quality of life is described in mortality rates and life expectancy, risk factors and insurance rates. What really happens in our lives, however, can be completely different from these expectations. In short, our personal future is unpredictable, while that of the population is fairly predictable.
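The difference between a predictable population and an unpredictable individual can be sketched numerically. The example assumes, purely for illustration, that each molecule reacts after an exponentially distributed waiting time with the same rate constant (the name `RATE` and its value are hypothetical). The population mean then settles close to 1/RATE, while the waiting time of any single molecule can lie far from it.

```python
import random
import statistics

def waiting_times(rate, n_molecules, rng):
    """Draw one exponentially distributed waiting time per molecule."""
    return [rng.expovariate(rate) for _ in range(n_molecules)]

rng = random.Random(1)  # fixed seed so that a run can be repeated
RATE = 1.0              # hypothetical rate constant, in arbitrary units

times = waiting_times(RATE, 10_000, rng)

# The population average is reproducible and close to 1 / RATE ...
print(f"population mean waiting time: {statistics.mean(times):.3f}")

# ... but individual molecules differ widely from that average.
print("five individual molecules:", [round(t, 3) for t in times[:5]])
print(f"shortest: {min(times):.5f}, longest: {max(times):.3f}")
```

The mean is a property of the ensemble; no single entry in `times` has to resemble it, just as an individual life need not resemble the life expectancy of the population.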

With the ever-increasing sophistication of their tools, biologists are entering this realm of small numbers and complex interactions. It is a world of microscopic properties, where the number and intensity of interactions among molecules are in constant flux. Biological systems are truly dynamic systems. Thanks to powerful modern computers, some of these processes can be simulated where they cannot be observed for lack of sufficiently sensitive instrumentation. And it is often the instabilities within the instruments themselves that limit the resolution of the observable signals.
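As one hedged illustration of what such a simulation can look like, the sketch below uses the Gillespie direct method, a standard stochastic simulation algorithm that is brought in here as an example and is not named in the text above. It follows a hypothetical reversible reaction A <-> B one event at a time; the rate constants, molecule counts and function name are assumptions chosen only for the example.

```python
import random

def gillespie_isomerization(n_a, n_b, k_forward, k_backward, t_end, rng):
    """Simulate A <-> B with the Gillespie direct method.

    Returns a list of (time, number_of_A) pairs: the initial state
    followed by one entry per reaction event up to t_end.
    """
    t = 0.0
    trajectory = [(t, n_a)]
    while True:
        rate_forward = k_forward * n_a    # propensity of A -> B
        rate_backward = k_backward * n_b  # propensity of B -> A
        total_rate = rate_forward + rate_backward
        if total_rate == 0.0:
            break  # no molecule can react
        t += rng.expovariate(total_rate)  # random waiting time to next event
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * total_rate < rate_forward:
            n_a, n_b = n_a - 1, n_b + 1
        else:
            n_a, n_b = n_a + 1, n_b - 1
        trajectory.append((t, n_a))
    return trajectory

rng = random.Random(42)  # fixed seed so that a run can be repeated
for total in (20, 2000):
    traj = gillespie_isomerization(total, 0, 1.0, 1.0, 10.0, rng)
    # Look only at the later part of the run, after the initial relaxation.
    late = [n / total for time, n in traj if time > 2.5]
    print(f"{total:>5} molecules: fraction of A ranges "
          f"from {min(late):.3f} to {max(late):.3f}")
```

With equal forward and backward rate constants, the fraction of A hovers around one half in both cases, but with 20 molecules it should wander much further from that value than with 2000, which is the small-number behavior described above.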

Copyright  © 2000-2005 Lukas K. Buehler