Human survival after age 100: Good luck or good health?

The Danish Birth Cohort studies are prospective studies that collected information on the health of the participants when they turned 100 years old. Informed consent was obtained from participants when they were tested. Where individuals participated through a proxy responder, the proxy responder provided informed consent. Date of death was obtained from the Danish Civil Registration System. All methods were carried out in accordance with relevant guidelines and regulations.


Abstract

Background
The existence of a super-select group of centenarians that demonstrates increased survivorship has been hypothesized. However, it is unknown if this super-select group possesses similar characteristics apart from extreme longevity.

Methods
In this study, we analyse high-quality health and survival data of Danish centenarians born in 1895, 1905 and 1910. We use Latent Class Analysis to identify unobserved health classes and to test whether these super-select lives share similar health characteristics.

Results
We find that, even after age 100, a clear and distinct gradient in health exists and that this gradient is remarkably similar across different birth cohorts of centenarians. Based on the level of health, we identify three clusters of centenarians (robust, frail and intermediate) and show that these groups have different survival prospects. The most distinctive characteristic of the robust centenarians is their outperformance in different health dimensions (physical, functional and cognitive). Finally, we show that our health class categorizations are good predictors of the survival prospects of centenarians.

Conclusions
There is a clear stratification in health and functioning among those over 100 years of age, and these differences are associated with survival beyond age 100.

Background
Those who live to the oldest ages, particularly centenarians, are a select group (1). Medford et al. (2) discuss the possibility of an additional layer of selection among centenarians, a so-called "super-select" group that consistently survives the longest beyond the age of 100 years. These individuals are the frontrunners of longevity, surviving as far as the 95th percentile of the distribution of lifespans above age 100 (i.e. beyond age 105) (2), and they exhibit greater improvements in their individual lifespan than other centenarians (3,4). Though some may be robust from birth, resilience at younger ages does not necessarily translate into resilience during old age, because an individual is exposed to the risk of sickness over their entire life course and may become infirm before reaching old age. It was previously believed that at extreme ages, survival chances were largely random and driven more by stochastic determinants than anything else (5). However, Medford et al. (2) postulate that the super-select group of lives benefits most from improvements in medical technology and healthcare advances and is best positioned to take advantage of further increases in human lifespan. This hypothesis implies that i) the super-select might share similar traits, ii) such traits might be common across different birth cohorts and iii) survival to extremely old ages may not be as random as some suggest. Therefore, a better understanding of the characteristics of exceptionally long-lived individuals may help to shed light on what is required for healthy aging.
Apart from extreme longevity, what traits distinguish the super-select? Centenarians have defeated death for at least 100 years, yet no centenarian is exactly the same as another (1). This uniqueness is due to different lifestyles (6)(7)(8), behaviour (9,10), genetics (11), physiological make-up (12)(13)(14), environmental determinants (15), exposure to prior and ongoing medical treatment (16)(17)(18) and many unobserved or unobservable factors (19) that ultimately lead to disparate lifespans. Most centenarians die within the first two years after reaching age 100, with relatively few surviving much longer [1] (20). Heterogeneity, both in individual lifespans and in observed and unobserved traits, is therefore natural and common among centenarians (21). This inherent heterogeneity entails that some centenarians will make it to the frontier of survival (22) by chance and not necessarily because of any traits that they have in common with the super-select (5). Similarly, some might be categorized as super-select but will die soon after their 100th birthday. Therefore, in order to correctly determine the traits of the super-select, it is paramount that the issue of heterogeneity is carefully addressed.
Previous studies on the health of nonagenarians (i.e. 93-95 years old) (23) provide valuable hints on the traits expected to be found in the super-select centenarians (i.e. the 95th percentile of the distribution of lifespans above age 100, beyond age 105 (2)). By using cluster analysis to control for heterogeneity in health, some researchers (24,25) have shown that nonagenarians can be categorized according to specific health classes, where one class has a consistent advantage over the others. It has also been shown that factors which are usually good at differentiating and predicting survival at younger ages (e.g. smoking, obesity level, education, number of chronic diseases) do not explain survival differences among nonagenarians (26). Instead, cognitive and physical abilities and, to some extent, an optimistic personality are regarded as strong predictors (26)(27)(28). Further, survival among nonagenarians is improving across cohorts (29). These improvements are accompanied by better health and functioning across the health spectrum (30)(31)(32)(33).
It cannot be taken for granted that the associations between health and survival previously shown for nonagenarians will automatically apply to those aged 100 or more. These associations (26,30) cannot be blindly extrapolated to centenarians (or individuals surviving beyond age 100), because only 10-15% of nonagenarians make it to age 100 (20). Furthermore, studies in Denmark and Sweden have shown that improvements in survival for centenarians are negligible when looking at the median and mean lifespan above age 100 (20). Survival improvements for Denmark are observed for only a relatively small proportion, the super-select (i.e. the 95th percentile of the distribution of lifespans above age 100, above age 105) (2), and are not present for Sweden. Therefore, the assessment of health characteristics among centenarians is important to understand whether survival above age 100 is a random process (i.e. due to "luck") or whether there are patterns that drive the survival improvements of the super-select. An absence of commonalities among health characteristics might explain the lack of survival improvements observed in the mean lifespan of centenarians (20).
The aim of the study is to reveal the health characteristics that distinguish super-selected lives surviving more than 100 years. We hypothesize that the super-select are the most resilient centenarians in terms of health, by virtue of their capacity to enhance their survival chances and reach the frontier of human survival. Robustness is therefore linked with the plasticity of ageing at the individual level, in the sense that the most robust individuals exhibit greater malleability in their lifespans. We identify robustness via the analysis of high-quality data from the 1895, 1905 and 1910 Danish Birth Cohort Studies (34) with a statistical technique known as Latent Class Analysis (35,49-53). We test the predictive power of our findings by computing the Area Under the Curve statistic (AUC, see e.g. Robin et al. (36)).
... showing that these characteristics are related to the survival of nonagenarians (27). The Chair Stand test was used to assess physical ability: individuals who can stand up from a chair without the use of arms are in better physical health than those who need to use hands or those who cannot stand up (37). Functional status was assessed by five questions regarding the ability to perform activities of daily living: bathing, dressing, toileting, ability to walk and feeding. Individuals were divided into not disabled, moderately disabled and disabled according to the Katz disability score calculated from their answers (38). The cognitive status of centenarians was evaluated using the Mini-Mental State Examination (MMSE). MMSE scores range from 0 to 30; the higher the score, the better the cognitive status. We divided the score into three categories: 24-30 indicates no cognitive impairment, 18-23 mild cognitive impairment and 0-17 severe cognitive impairment (39). Self-rated health was assessed with the question: "How do you consider your health in general?". It was divided into three categories: "excellent or good", "acceptable" and "poor or very poor" (40).
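The cut-offs described above can be written down compactly. The following is a minimal sketch in Python of the MMSE and self-rated health groupings; the function names and the raw answer labels ("excellent", "good", and so on) are assumptions made for illustration and are not taken from the survey instrument.

```python
def mmse_category(score: int) -> str:
    """Map an MMSE score (0-30) to the three groups described in the text."""
    if not 0 <= score <= 30:
        raise ValueError("MMSE score must lie between 0 and 30")
    if score >= 24:
        return "no impairment"      # 24-30
    if score >= 18:
        return "mild impairment"    # 18-23
    return "severe impairment"      # 0-17


def self_rated_health_category(answer: str) -> str:
    """Collapse a raw answer into the three reported groups.

    The raw answer labels are hypothetical; only the three
    output groups are taken from the text.
    """
    groups = {
        "excellent": "excellent or good",
        "good": "excellent or good",
        "acceptable": "acceptable",
        "poor": "poor or very poor",
        "very poor": "poor or very poor",
    }
    return groups[answer.strip().lower()]
```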
These four indicators of health had missing values. To handle them without introducing bias into our results, we took advantage of other information in the survey that was not included in the analysis. We created a "non tested" category for Chair Stand, MMSE and Self-Rated Health. For the Chair Stand score, individuals with missing values who could not perform the physical performance ADL Strength test were included in the "non tested" category. For MMSE and Self-Rated Health, we categorized individuals with missing values whose answers were provided by a proxy respondent as "non tested", the rationale being that these tests cannot be performed by proxy respondents.
For the Katz disability score we did not create a "non tested" category; however, this score had very few missing values (2 individuals in each cohort). The creation of the "non tested" category allowed us to considerably reduce the number of missing values for participants who were unable to respond due to ill health (41). However, there were still some missing values in the dataset. Thus, we removed individuals who had missing values in at least one of the variables in the analysis [1].
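The handling of missing values described above amounts to two steps: recoding a missing value as "non tested" when the test could not be performed, and then keeping only complete cases. A minimal sketch follows, assuming each individual is stored as a dictionary of indicator values; the function names and the `unable_or_proxy` flag are hypothetical and only illustrate the logic.

```python
from typing import Dict, List, Optional

NON_TESTED = "non tested"


def code_indicator(value: Optional[str], unable_or_proxy: bool) -> Optional[str]:
    """Return the observed category if there is one.

    A missing value becomes "non tested" when the individual could not
    perform the test or answered through a proxy; otherwise it stays
    missing (None).
    """
    if value is not None:
        return value
    return NON_TESTED if unable_or_proxy else None


def complete_cases(records: List[Dict[str, Optional[str]]]) -> List[Dict[str, Optional[str]]]:
    """Keep only individuals with no missing value in any indicator."""
    return [r for r in records if all(v is not None for v in r.values())]
```

Under this sketch, recoding is applied to the indicators for which a "non tested" category exists, and `complete_cases` is applied afterwards to the whole record.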
The date of death of each centenarian in Denmark (participants and non-participants) was retrieved from the Danish Civil Registration System. Some survey participants died before turning age 100 (e.g. ages 99.7, 99.5, etc.). We excluded these individuals from the main analysis to avoid immortal time bias in the calculation of survival probabilities (42). Finally, we also conducted a sensitivity analysis to test the effect of removing those participants that did not reach age 100 in each cohort (1895, 1905 and 1910). All sensitivity analyses and robustness checks are included in the Supplemental Material.

Statistical analysis
We perform a Latent Class Analysis (LCA) to shed light on the unobserved heterogeneity in health among Danish centenarians. LCA is a statistical method used to identify unobserved classes of individuals via observed categorical variables (43,49-53). By considering several individual characteristics, the LCA determines individual probabilities of belonging to the latent classes and the probabilities of finding a person with a certain characteristic in each class. More details about the LCA can be found in the Supplemental Material. Individuals in each class share similar characteristics and, at the same time, differ from individuals in other classes. Our aim is to identify health classes and then contrast the survivorship of the individuals belonging to each of them. We consider different dimensions of health in the LCA: physical health (Chair Stand test), functional status (Katz disability score), cognitive impairment (MMSE) and emotional wellbeing (Self-Rated Health). It is known that there are sex differences in health and survival among centenarians (44). For this reason, we included sex as a covariate that helps to place individuals into classes (35). We could not stratify the analysis by sex because of the small number of males in the study population.
To test the robustness of our results, we performed three different sensitivity analyses. First, we included smoking in the LCA in addition to the four health indicators mentioned above. While it has been shown that smoking is not related to survival at the highest ages (26), we performed this additional analysis to determine how the inclusion of an unrelated health indicator affects our results. Second, given that most centenarians are women, we performed an extra analysis considering only females in the LCA. Finally, we performed an LCA that included the individuals who died before age 100.
We performed an LCA for each cohort. Because individuals in the 1895 cohort are not directly comparable to those in the 1905 and 1910 cohorts, owing to differences in the questionnaire used, and because their survival trajectories differ from those of the non-participants (see details in Data section), we present the analysis of the 1895 cohort in the Supplemental Material and focus here on the 1905 and 1910 cohorts. For each cohort, several LCAs were performed, varying the number of classes from two to six. We considered six health classes to be the maximum possible in each cohort. More than six classes would imply high heterogeneity in health patterns but also small and meaningless classes. The optimal number of classes was selected by looking at the Akaike and Bayesian Information Criteria (AIC and BIC respectively), while also considering the health patterns and size of each class. Once the optimal number of classes in each cohort was obtained, each centenarian was assigned to a single health class. Then, based on their ages at death, we computed survival curves and the associated 95% confidence intervals by health class and by cohort using the Kaplan-Meier estimator. We assessed whether there are differences in survival among the classes using the log-rank test.
Finally, we estimated the area under the curve (AUC) to test the ability of health classes to predict the chance of surviving to the frontier of survival. The AUC ranges from 0 to 1; a higher AUC implies a better prediction (36). We define the frontier of survival (2,45) as the 95th percentile of the centenarian age-at-death distribution. Note that the corresponding ages change across cohorts according to mortality improvements. In Table 1 we show these ages and the AUC values calculated for different percentiles.
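To make the idea of latent classes concrete, the following is a minimal sketch of a basic latent class model fitted with the EM algorithm in Python with NumPy. It is an illustration under simplifying assumptions, not the procedure used in this study: it omits the sex covariate, uses a single random start, and the function name `fit_lca` and its arguments are hypothetical. Each indicator is assumed to be coded as an integer category starting at 0.

```python
import numpy as np


def fit_lca(X, n_classes, n_categories, n_iter=200, seed=0):
    """Fit a basic latent class model by EM (no covariates).

    X            : (n, J) integer array, X[i, j] in {0, ..., n_categories[j]-1}
    n_classes    : number of latent classes K
    n_categories : list with the number of categories of each indicator
    Returns class weights, per-indicator response probabilities,
    posterior membership probabilities from the final E-step, and the
    log-likelihood at each iteration.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    n, J = X.shape
    weights = np.full(n_classes, 1.0 / n_classes)
    # probs[j][k, c] = P(indicator j takes category c | class k); random start
    probs = [rng.dirichlet(np.ones(m), size=n_classes) for m in n_categories]
    loglik_trace = []
    for _ in range(n_iter):
        # E-step: log joint probability of each response pattern and each class
        log_joint = np.tile(np.log(weights), (n, 1))
        for j in range(J):
            log_joint += np.log(probs[j][:, X[:, j]].T)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        post = np.exp(log_joint - log_norm[:, None])
        loglik_trace.append(float(log_norm.sum()))
        # M-step: update class sizes and response probabilities
        weights = post.mean(axis=0)
        for j, m in enumerate(n_categories):
            onehot = np.eye(m)[X[:, j]]        # (n, m) indicator matrix
            counts = post.T @ onehot + 1e-6    # small constant avoids log(0)
            probs[j] = counts / counts.sum(axis=1, keepdims=True)
    return weights, probs, post, loglik_trace
```

In this sketch, `post.argmax(axis=1)` assigns each individual to the class with the highest posterior probability, and `probs[j]` gives the kind of within-class probabilities that describe the composition of each class. In practice several random starts would be advisable, since EM can converge to a local optimum.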
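As an illustration of the survival step, the following is a minimal sketch of the Kaplan-Meier estimator in plain Python. It assumes follow-up times measured from age 100 and a 0/1 death indicator, and it uses the plain Greenwood formula for the 95% intervals; the interval type used in this study is not stated here, so that choice is an assumption. The log-rank test is not included.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate with Greenwood 95% intervals.

    times  : follow-up time of each individual (e.g. years lived after age 100)
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival, lower, upper), one entry per
    distinct time at which at least one death occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv, greenwood = 1.0, 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            if at_risk > deaths:
                greenwood += deaths / (at_risk * (at_risk - deaths))
            se = surv * math.sqrt(greenwood)
            curve.append((t, surv,
                          max(0.0, surv - 1.96 * se),
                          min(1.0, surv + 1.96 * se)))
        at_risk -= removed
    return curve
```

Applied separately to the individuals assigned to each health class, this yields one curve per class, and the survival probability to age 105 corresponds to the value of the curve at a follow-up time of five years.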
[1] The use of statistical imputation techniques like mean substitution or multiple imputation was avoided because these procedures might bias the results of the Latent Class Analysis and make comparisons among cohorts more uncertain. Therefore, we performed the analysis considering only the individuals that have complete values.
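The AUC calculation described in this section can be sketched as follows. The sketch assumes that health class is coded ordinally (for example frail = 0, intermediate = 1, robust = 2) and used directly as the prediction score, and that the outcome is 1 if the individual died at or beyond the chosen percentile age; both the coding and the function names are assumptions for illustration. The AUC is computed as the proportion of (survivor, non-survivor) pairs in which the survivor has the higher score, with ties counted as one half.

```python
import math


def percentile_age(ages_at_death, q=0.95):
    """Empirical q-th quantile (nearest-rank method) of ages at death."""
    ordered = sorted(ages_at_death)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparisons.

    scores   : prediction score per individual (higher = better health)
    outcomes : 1 if the individual reached the frontier of survival, else 0
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome groups must be non-empty")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to a prediction no better than chance, and a value of 1 to perfect separation of those who reach the frontier from those who do not.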

Results
Results from the Latent Class Analysis (LCA) indicate that the optimal number of health classes for the 1905 and 1910 cohorts is three (see Supplemental Material). For the 1895 cohort the optimal number of health classes is two, which indicates that there is less heterogeneity in health for this cohort, possibly due to health selection. Indeed, as indicated in Section 3, the survival trajectories of survey participants are statistically different from those of individuals who did not participate in the survey (see Supplemental Material). Therefore, the results for the 1895 cohort are not nationally representative. In this section, we describe and compare the results of the 1905 and 1910 cohorts only (which are nationally representative). Results for the 1895 cohort can be found in the Supplemental Material.
Sex, included in the model as a covariate, is not statistically significant in either of the cohorts. This could be because most centenarians are female (around 80% in each cohort). In the Supplemental Material we include a sensitivity analysis where only females are considered. The LCA health classes obtained from the females-only analysis are practically the same as the ones obtained in the original analysis. This could be attributed to the fact that most centenarians are women, but also to the fact that health differences between the sexes are already reflected in the health dimensions included in the LCA.
Every LCA class is composed of individuals who share similar health characteristics. Figure 1 shows the composition of each class for the 1905 and 1910 cohorts. Based on their characteristics, we denote the classes as robust, frail and intermediate. Each bar represents a health characteristic and the size of the coloured bar depicts the probability of exhibiting that characteristic. For example, robust centenarians have a 44% chance of being able to stand up from a chair with the use of hands (aqua green bar) and a 56% chance of being able to do so without using hands (dark green bar).
Robust centenarians comprise around 117 individuals (60%) of the 1905 cohort and 90 individuals (40%) of the 1910 cohort. They are likely to stand up from chairs by using their arms and have high probabilities of not being physically disabled at all or being only moderately disabled. It is likely that most of them do not show significant cognitive impairment. The majority perceive their health as good. Frail centenarians, on the other hand, are likely to be unable to stand up from a chair and to report physical disability. Due to their poor health, the majority of them could not be tested for their cognitive status and emotional wellbeing. Frail centenarians comprise 16% and 17% of the 1905 and 1910 cohorts respectively (around 35 individuals in each cohort). Finally, the intermediate health class comprises 24% and 42% of the 1905 and 1910 cohorts respectively. This class includes centenarians who perform worse physically and cognitively than the robust centenarians. Most of them perceive their own health to be good or acceptable.
There are many similarities in the composition of each cluster when compared across cohorts (Figure 1). The characteristics of robust centenarians are almost identical in the 1905 and 1910 cohorts. This is also true for the intermediate and frail classes. Despite not being directly comparable, the robust health class in the 1895 cohort resembles the robust health classes in the 1905 and 1910 cohorts (see Supplemental Material). These commonalities in health classes across cohorts support our hypothesis about a group of centenarians outperforming in health outcomes. Thus, the question arises: are the robust centenarians also outperforming in survival? To answer this question, we computed survival curves and the associated 95% confidence intervals for the three health classes found in each cohort. Figure 2 shows the results for the 1905 and 1910 cohorts: there are clear differences in survival among health classes, with generally non-overlapping confidence intervals. Note, however, that at the very highest ages the confidence bands grow wider and tend to overlap due to the very small number of survivors at those ages. Nonetheless, the log-rank test confirms formally that the three survival curves are statistically distinct (see Supplemental Material). Robust centenarians live longer than those in the other two health classes. In the 1905 cohort, their probability of survival to 105 is 0.12. For the 1910 cohort, the equivalent survival probability is 0.17, which is almost six times that for those in the frail health class. A survival gap between the robust and frail classes is also present in the 1895 cohort (see Figure A1 in Supplemental Material).
Next, we tested the ability of health classes to predict survivorship to the frontier of survival (defined by Medford et al. (2019) as the 95th percentile of the centenarian age-at-death distribution) by computing the AUC (area under the curve). Depending on the percentile, the AUC ranged between 0.65 and 0.68 for the 1905 cohort and between 0.71 and 0.76 for the 1910 cohort (see Table 1). For the 1895 cohort, the area under the curve was estimated to be around 0.70. The AUC shows that the health class is a good predictor for reaching the frontier of survival. These findings are consistent across cohorts.
In a previous study, Thinggaard et al. (26) showed that the combination of Chair Stand and MMSE scores is a good predictor of survival among nonagenarians, so we compare the predictive ability of this approach with that of our LCA health classes [1]. Both approaches (LCA health classes and Thinggaard et al. (26)) are useful in determining the survival chances to extreme ages (see Table 1 and Table A9 in the Supplemental Material). However, our LCA health classes provide a more thorough description of individual health, enabling us to identify similarities in the health of centenarians. The LCA health classification provides a framework to determine the traits involved in the optimal pathways of healthy ageing. Finally, we tested the robustness of our results by performing three sensitivity analyses of the LCA: i) considering only females in the analysis, ii) including smoking, and iii) including those individuals that did not survive to age 100. In all three analyses we obtained results similar to those from the original analysis. Thus, we conclude that our analysis adequately captures the relationship between unobserved health categories and survival at extremely old ages. The sensitivity analyses can be found in the Supplemental Material.
[1] Following the approach of Thinggaard et al. (2016), we categorise as robust those who can stand up from their chair, with or without hands, and who have an MMSE>24. Under this approach, the AUC ranged between 0.60 and 0.72 for the 1905 cohort and between 0.61 and 0.66 for the 1910 cohort. See Table A12 in the Supplemental Material.
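The rule in this footnote can be expressed in a few lines. The sketch below takes the wording literally (able to stand up with or without hands, and MMSE strictly greater than 24); the function name and the Chair Stand labels are hypothetical.

```python
def rule_based_robust(chair_stand: str, mmse: int) -> bool:
    """Two-indicator rule: able to stand up from a chair and MMSE > 24.

    chair_stand uses hypothetical labels: "without hands",
    "with hands", or anything else for unable / non tested.
    """
    can_stand = chair_stand in {"with hands", "without hands"}
    return can_stand and mmse > 24
```

The resulting 0/1 indicator can be used as the prediction score in an AUC calculation in the same way as the LCA health classes.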

Discussion
Those surviving to the oldest ages (i.e. beyond age 105) had better health at age 100 than other survivors from their cohort. The major contributions of this study are that i) we show the existence of a clear stratification in health and functioning among those 100 years of age and ii) we shed light on the characteristics of the super-select centenarians (i.e. those surviving to age 105 and above). To do so, we use a high-quality dataset (34) and consider different dimensions of health: physical health (Chair Stand test), functional status (Katz disability score), cognitive impairment (MMSE) and emotional wellbeing (Self-Rated Health), which, when taken together, provide a well-rounded view of centenarian health and functioning.
The majority of centenarians are females and the most distinctive characteristics of the robust cluster versus the other health clusters stem from their outperformance in physical, functional and cognitive health. Most of them perceive their own health to be good or excellent. This perhaps could explain the upward trend in lifespans previously observed within this group (2). In contrast, the intermediate and frail individuals show greater levels of physical and cognitive impairment and they have lower chances of surviving in comparison to those in the robust health class.
It was previously believed that at the highest ages, the chances of survival were mostly random events. This school of thought suggests that survival is driven by stochastic determinants (5). In reality, human survival is more idiosyncratic than this. We show that even at age 100 there are clear disparities in the survival prospects of individuals based on their level of health. The extension in lifespan for centenarians is due to the survival of the healthiest centenarians rather than to keeping the frailest alive (46,47). This pattern was evident in all the centenarian cohorts of our study. Indeed, our study revealed evidence of a super-select group who are in better health and survived longer than the other centenarians. They were present in all the cohorts studied here: clearly identified in the 1905 and 1910 cohorts and slightly less clear-cut in the 1895 cohort. However, we also show that there is selection in the 1895 cohort, because the survival trajectories of the survey participants are statistically different from (with higher survival probabilities than) those of individuals who did not participate in the survey. Therefore, the results of the 1895 cohort should be taken with caution.
One clear limitation of this study is that health characteristics are recorded only at age 100, while decline is likely to be rapid thereafter. At very old ages, health deterioration is likely to appear from one year to another (49). Still, the data used in these analyses measure a sufficiently wide range of functioning to reasonably depict an individual's general health status (30,34). Likewise, it is unknown whether similar findings are observed among the centenarians of other countries. In Sweden, for example, Medford et al. (2) do not find a super-select group with increased plasticity of individual lifespans. It would be interesting to determine whether a robust health class is found in Sweden and to compare the results with our findings.

Conclusion
We conclude that survival advances beyond age 100 are mainly driven by this super-select group of the healthiest individuals surviving for a longer time. This is not to say that those in poor health have not been living longer as well. They have been. However, the super-select lives have been living longer than any other group, and any further pushing of the frontier of survival forward will most likely be by those in the most robust health and not those in poor health. Any improvements in the dimensions of health studied here could lead to a higher prevalence of robust centenarians and ultimately to a longer-living population.
The Danish Birth Cohort studies are prospective studies that collected information on the health of the participants when they turned 100 years old. Informed consent was obtained from participants when they were tested. Where individuals participated through a proxy responder, the proxy responder provided informed consent. Date of death was obtained from the Danish Civil Registration System.
Data supporting the findings of this study were used under license of the Regional Committees on Health Research Ethics for Southern Denmark (https://en.nvk.dk/). Data are available from the authors upon request.

Declarations
Competing interest: The authors declare that they have no competing interests.
Consent for publication: Not applicable.