Tuesday, November 1, 2016

Is there really evidence for a limit to human lifespan?

Driven by technological progress, human life expectancy has increased greatly since the nineteenth century. A study recently published in Nature used demographic data to reveal a lifespan of ~115 years that human beings cannot exceed, simply by virtue of being human. However, when data scientists re-analyzed the data published in the Nature paper using model comparison, a different picture emerged: If anything, their data supports the idea that our lifespan is nowhere close to reaching a limit.


Evidence for a limit to human lifespan

Human life expectancy has steadily increased since the nineteenth century. Reports of supercentenarians (people who live to 110 or older), together with observations of model animals whose lifespans can be extended through genetic or dietary modifications, have prompted some to suggest that there is no upper limit on human lifespan. Others say that the steady increase in life expectancy and maximum human lifespan seen during the last century will eventually stop.

To investigate, Jan Vijg, a geneticist at Albert Einstein College of Medicine in New York City, and his colleagues turned to the Human Mortality Database, which spans 38 countries and is jointly run by US and German demographers. They reasoned that if there’s no upper limit on lifespan, then the biggest increase in survival should be experienced by ever-older age groups as the years pass and medicine improves. Instead, they found that the age with the greatest improvement in survival rose steadily from the early twentieth century, but then started to plateau at about 99 in 1980. (The age has since increased by a very small amount.) Their findings were published in Nature in October 2016.

Vijg’s team concluded that there is a natural limit to human lifespan of about 115 years. There will still be occasional "outliers" who live longer, but Vijg calculates that the probability of a person exceeding 125 in any given year is less than 1 in 10,000. The limit is surprising, says Vijg, given that the world’s population is increasing (supplying an ever-increasing pool of people who could live longer) and that nutrition and general health have improved.

But not everyone agrees with his team's interpretation. The age experiencing the greatest increase in survival may have plateaued in many countries, says James Vaupel, founding director of the Max Planck Institute for Demographic Research in Rostock, Germany. But it has not yet plateaued in some countries that are particularly relevant to this research, such as France, Italy, or Japan, the last of which has the world’s highest life expectancy (83.7 years for those born in 2015).

Evidence against a limit to human lifespan

Some of the most convincing evidence comes from Philipp Berens and Tom Wallis at the University of Tübingen, Germany. When they re-analyzed the data presented in the Vijg study, they came to the conclusion that the paper does not provide convincing statistical evidence one way or the other. If anything, Berens and Wallis argue, the Vijg data would support a linear model, under which the maximum age at death would simply continue to grow linearly, the opposite of the Vijg team's conclusion. Their full analysis is available on GitHub.

To make sure they correctly extracted the data from the Vijg paper, Berens and Wallis performed a sanity check by plotting the maximum reported age at death (MRAD) for each year between 1968 and 2006, which can be seen in the figure on the left. The plot shows the raw data points in black and separate linear fits with 95% confidence intervals for the years before and after 1995. Here the year 1995 acts as a trend break: from that year on, the linear fit has a different slope. (Note: It is not clear from the paper why the authors chose 1995 as the point at which to split the model, but later on they showed similar results with the year 1998 as a break point.)

A simple alternative hypothesis to the trend-break model put forth in the Nature paper would be that MRAD actually keeps increasing and therefore, that there is no limit to human lifespan. This much simpler alternative is termed the linear model, and IMHO should have been included in the Nature paper as a point of comparison (Occam's razor, anyone?). Here are the two models side-by-side:

Which model seems to do better here? In order to answer this question, an objective model comparison is needed.

One possible comparison is to ask which model can explain more variance in the observed data. The higher the explained variance, the better the model. Turns out that the trend-break model can explain somewhat more variance than the linear model (0.42 vs. 0.29). However, it is important to consider that the trend-break model uses four parameters to do so, compared to only two parameters in the linear model. It is not surprising that a more complicated model provides a better fit, and we therefore need to ask whether the increase in explained variance is "worth" the additional parameters. A formal way to do this is to use something called the Akaike Information Criterion (AIC), which rewards goodness of fit but penalizes every additional parameter. And voilà, the two models have very similar AIC values, and the actual numbers (an AIC difference of -2.5) provide "substantial" support for the simpler linear model.
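To make the comparison concrete, here is a minimal Python sketch of the two fits. Big caveats: the data below are synthetic (a made-up straight line plus noise, not the MRAD values from the paper), the function names are my own, and the AIC formula is the usual least-squares form up to an additive constant. Berens and Wallis did their analysis in R, so treat this as an illustration of the idea, not a reproduction of their numbers.

```python
import numpy as np


def fit_linear(x, y):
    """Least-squares straight line; returns the fitted values at x."""
    x_mean = x.mean()  # center x to keep the fit well conditioned
    slope, intercept = np.polyfit(x - x_mean, y, 1)
    return slope * (x - x_mean) + intercept


def fit_trend_break(x, y, break_year):
    """Two independent lines: one before break_year, one from it onward."""
    fitted = np.empty(len(y), dtype=float)
    before = x < break_year
    fitted[before] = fit_linear(x[before], y[before])
    fitted[~before] = fit_linear(x[~before], y[~before])
    return fitted


def r_squared(y, fitted):
    """Fraction of variance in y explained by the fitted values."""
    rss = np.sum((y - fitted) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - rss / tss


def aic(y, fitted, n_params):
    """Gaussian least-squares AIC, up to an additive constant."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + 2 * n_params


# Synthetic stand-in for the MRAD series (NOT the data from the paper).
rng = np.random.default_rng(0)
years = np.arange(1968, 2007, dtype=float)
mrad = 110.0 + 0.1 * (years - 1968.0) + rng.normal(0.0, 1.5, size=years.size)

linear_fit = fit_linear(years, mrad)
break_fit = fit_trend_break(years, mrad, break_year=1995.0)

r2_linear = r_squared(mrad, linear_fit)
r2_break = r_squared(mrad, break_fit)
# Parameter counts follow the post: 2 for the line, 4 for the trend-break model.
aic_linear = aic(mrad, linear_fit, n_params=2)
aic_break = aic(mrad, break_fit, n_params=4)

print(f"R^2  linear: {r2_linear:.2f}   trend-break: {r2_break:.2f}")
print(f"AIC  linear: {aic_linear:.1f}   trend-break: {aic_break:.1f}")
```

Because a single line is a special case of two separate lines, the trend-break model can never explain less variance than the linear one. That is exactly why the explained variance alone cannot settle the question, and why the AIC adds a penalty of 2 per parameter. The model with the lower AIC is preferred.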

Berens and Wallis then went on to produce additional metrics, such as the Bayesian Information Criterion (BIC), Bayes factors, and nested ANOVAs, all of which suggest that there are no statistically significant differences between the two models. Based on these statistical considerations, Berens and Wallis concluded that the main data reported in the Vijg paper provide no support for or against the idea that there is a limit to human lifespan.

However, Berens and Wallis didn't stop there. They argued that a more appropriate statistical analysis would have been Extreme Value Analysis, a theoretical framework for modeling rare, extreme observations (such as a person living past the age of 110). Specifically, the data presented in the Nature paper are an example of block maxima, in which the maximum is computed for each "block" (a year in this case) of a distribution (here, the distribution of age at death). The distribution of block maxima is known to converge to a Generalized Extreme Value (GEV) distribution as the block size approaches infinity (see Coles, 2001 or Gilleland & Katz, 2016). The GEV distribution has three parameters: a location parameter that defines the center of the distribution, a scale parameter that defines its spread, and a shape parameter that defines the weight of the tails. With these tools in hand, they first fit a GEV model to the MRAD data (location = 110 years, scale = 1.6, shape = 0.10), and then asked how the location parameter changed over time. What they found was, again, strong support for the linear model over the trend-break model. And once again, they concluded that the Vijg paper provides no evidence for a limit to human lifespan.
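For the curious, a GEV fit can be sketched in a few lines of Python with scipy.stats.genextreme. Again, this is only an illustration under stated assumptions: the "maxima" are simulated from a GEV distribution with the parameter values quoted above (they are not the MRAD data), I am assuming the quoted shape of 0.10 uses the common xi sign convention, and the fit is stationary, so it leaves out the step where the location parameter is allowed to change over time. One thing to watch out for: scipy's shape parameter c has the opposite sign of xi.

```python
import numpy as np
from scipy.stats import genextreme

# Values quoted in the post, used here only to simulate data.
# Assumption: the shape 0.10 is in the xi convention, so scipy's c = -xi.
TRUE_SHAPE_XI = 0.10
TRUE_LOCATION = 110.0
TRUE_SCALE = 1.6

# Simulated block maxima (NOT the data from the paper). The sample is
# deliberately large so that the fit recovers the parameters well.
rng = np.random.default_rng(1)
maxima = genextreme.rvs(
    c=-TRUE_SHAPE_XI,
    loc=TRUE_LOCATION,
    scale=TRUE_SCALE,
    size=2000,
    random_state=rng,
)

# Maximum-likelihood fit; returns (c, loc, scale) in scipy's convention.
c_hat, loc_hat, scale_hat = genextreme.fit(maxima)
xi_hat = -c_hat  # convert back to the xi convention

print(f"location = {loc_hat:.2f}, scale = {scale_hat:.2f}, shape (xi) = {xi_hat:.2f}")
```

With only 39 yearly maxima, as in the MRAD series, the estimates would be far noisier than in this simulation, the shape parameter in particular.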


Or does it..?

However, Berens and Wallis didn't stop there either. In Extended Data Figure 6, the Nature paper presented a different dataset (from the Gerontology Research Group), presumably as independent evidence for the trend-break model over a linear one. This is a dataset of "verified supercentenarians" as of January 1, 2014. However, an important caveat is that the data presented in the Nature paper do not include observations from 1989 to 1996. Since this timespan includes the key years of the "trend break", the missing observations could have an important influence on the conclusions.

Therefore, Berens and Wallis recovered the full dataset of the Gerontology Research Group and repeated their analyses. And voilà! Including the missing data substantially increased support for the authors' model over a simple linear model. The ANOVA is highly significant; the AIC difference is -15, which corresponds to "essentially no" support for the simple linear model relative to the authors' model. The BIC difference is -11, which provides "very strong" support for the authors' model over the simple linear one. Finally, the Bayes factor is about 150, which also provides very strong support for the authors' model.
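To get a feel for what AIC differences like -2.5 and -15 mean, one common heuristic is the relative likelihood exp(-|ΔAIC| / 2), which says how plausible the higher-AIC model is compared to the lower-AIC one. The tiny sketch below is my own illustration of that heuristic; I am not claiming it is how Berens and Wallis arrived at their verbal labels.

```python
import math


def relative_likelihood(delta_aic):
    """Plausibility of the higher-AIC model relative to the lower-AIC one."""
    return math.exp(-abs(delta_aic) / 2.0)


# The two AIC differences quoted in this post.
for delta in (-2.5, -15.0):
    print(f"AIC difference {delta:6.1f} -> relative likelihood {relative_likelihood(delta):.5f}")
```

An AIC difference of 2.5 leaves the losing model at roughly 0.29 times the plausibility of the winner, so it is still very much in the race. At a difference of 15 that ratio drops to about 0.0006, which is why the losing model gets "essentially no" support.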

In the end, these data actually provide strong statistical evidence that we are approaching a limit to human lifespan, a limit we cannot exceed, simply by virtue of being human. Same headline, but a completely different story. According to Berens and Wallis, the Nature paper is deeply flawed, not only because the authors seem to be making questionable assumptions about the distribution of age-at-death values and the noise that affects these distributions, but also because they never compared their model against alternatives: "A model comparison like we performed here should have clearly been part of their paper."

Sources: The original paper is available at Nature, and the full report of Berens and Wallis, including R code, can be found on GitHub.