1. Introduction
Alzheimer’s disease (AD) is a complex and heterogeneous neurodegenerative disease. Developing novel treatments for AD requires accurate diagnosis of the disease, accurate measurement of disease progression, and reliable analysis of the data. An important analytic challenge has emerged in recent AD clinical trials. Heavily skewed data with small imbalances in the number of the most rapidly progressing patients have had a relatively large impact on differences between treatments in mean change to endpoint in two recently completed, identically designed trials in AD [1]. Results from the primary and multiplicity-adjusted secondary outcomes were all significant in one study, with the observed difference on the primary outcome close to what had been anticipated in study planning. In contrast, none of the differences approached significance in the other study, with the primary endpoint showing essentially no difference between drug and placebo [1].
The primary analysis in these studies was MMRM, and the primary estimand was based on the treatment policy strategy for dealing with the intercurrent event of early study drug discontinuation [1]. Both the sponsor and the FDA conducted post hoc sensitivity analyses utilizing transformations, non-parametric methods, and robust regression to deal with the large departure from normality, a key assumption in ANCOVA, MMRM, and similar analyses [1,2,3,4].
The anticipated decline in an early symptomatic Alzheimer’s disease population on CDR-SB in a 78-week period is 1–2 points [5,6,7]. However, much larger declines for individual patients are common [8,9,10]. These more rapidly progressing patients do not differ from other patients in demographic or baseline disease characteristics, comorbidities, concomitant medications, or the incidence of adverse events [2,3]. There is no single clinical feature that differentiates rapidly progressing patients from other patients [5].
Therefore, rapidly progressing patients are not a unique subgroup with a clear definition and distinct features that separate them from the majority of patients. Instead, the magnitude of progression follows a continuous distribution, with differences only in degree. Rapidly progressing patients—and the resultant skewed data—are thus part of the reality of Alzheimer’s disease, and after the fact, it is too late to address them in a completed randomized trial [2,3]. However, current practice does not often include assessments of, and sensitivity analyses for, non-normal data distributions. The analytic challenge, then, is how to plan for skewed data in the analyses of Alzheimer’s clinical trial data.
The primary purpose of the present investigation is to compare MMRM with a non-parametric and a robust regression approach in data that are and are not heavily skewed by rapidly progressing patients. The intent is to provide insight on ways to proactively deal with the highly skewed data that have been encountered in AD clinical trials. The remainder of this paper is organized as follows: Section 2 provides an overview of the consequences of and analytic approaches for dealing with skewed data; Section 3 outlines the design and analysis of a simulation study comparing MMRM with non-parametric and robust regression analytic approaches; Section 4 details the results of the simulation study; and Section 5 discusses those results in the context of the strengths and limitations of the simulation study.
2. Consequences of and Analytic Approaches for Dealing with Skewed Data
The statistical consequence of rapidly progressing patients is outliers/non-normality of the data [1,2,3]. With an anticipated decline in an early symptomatic Alzheimer’s disease population on CDR-SB in a 78-week period being 1–2 points [5,6,7] and with a standard deviation of approximately 2.0 [1], if data were normally distributed, changes of greater than 6–7 points would be rare. However, the data were heavily right-skewed, with ~1% of patients having a >8-point change from baseline, and the largest change was 13 points [1,3].
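The rarity claim can be checked directly. The sketch below uses Python's standard library; the mean of 1.5 is an assumed midpoint of the cited 1–2 point range, not a value from the trials.

```python
from statistics import NormalDist

# Assumed inputs: mean change 1.5 (midpoint of the cited 1-2 range), SD 2.0.
p_gt_8 = 1 - NormalDist(mu=1.5, sigma=2.0).cdf(8.0)

# Under normality, an 8-point worsening is a >3-SD event, so fewer than
# ~0.06% of patients would exceed it; the trials instead observed ~1%,
# an order of magnitude more than normality predicts.
print(f"P(change > 8 | normal) = {p_gt_8:.5f}")
```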
Statistical theory suggests methods other than MMRM may be useful to consider when data are heavily skewed [11]. Via the central limit theorem, in large trials, the concern regarding non-normal data is not bias; the concern is stability of results [12,13,14,15]. However, in smaller trials, bias may also be a concern.
The potential influence of heavily skewed data (or outliers) can be put into perspective by noting that an outlier with 3-fold the error magnitude of a typical observation contributes 9-fold (3²) as much to the squared error loss, and an outlier with 5-fold the error magnitude contributes 25-fold (5²). Therefore, even a few outliers can increase variance substantially. Maximum likelihood methods such as MMRM are robust to departures from normality in the sense that the Type I error rate does not increase under violations of normality so long as sample sizes are not small (~40 patients per arm or larger). However, estimates of individual parameters and Type II error may not be so robust.
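The arithmetic behind this squaring effect is elementary but worth making explicit:

```python
# Contribution to squared error loss grows with the square of the error
# magnitude relative to a typical observation.
typical = 1.0
contrib = {k: (k * typical) ** 2 / typical ** 2 for k in (3, 5)}
# A 3-fold error contributes 9-fold; a 5-fold error contributes 25-fold.
print(contrib)  # {3: 9.0, 5: 25.0}
```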
General categories of methodology for dealing with skewed distributions include robust methods and non-parametric methods. Non-parametric analyses, for example, those based on ranks or medians, are resistant to the influence of even extreme outliers because extreme values do not inflate ranks or medians. One example of a non-parametric approach is the Hodges–Lehmann estimator, which is essentially the median difference between groups.
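As a concrete sketch (standard library only), the two-sample form of the Hodges–Lehmann estimator is the median of all pairwise between-group differences; this illustrates the idea rather than the PROC NPAR1WAY implementation used later:

```python
from itertools import product
from statistics import median

def hodges_lehmann(treated, control):
    """Two-sample Hodges-Lehmann estimator: the median of all pairwise
    differences (treated - control). A single extreme value shifts only
    a few pairwise differences, so the median barely moves."""
    return median(t - c for t, c in product(treated, control))

# An extreme "rapid progressor" (13) barely moves the estimate:
print(hodges_lehmann([1, 2, 3, 13], [0, 1, 2]))  # 1.5
```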
Robust regression detects outliers and provides resistant (stable) results in the presence of outliers by limiting their influence. Three classes of problems have been addressed with robust regression: (1) outliers in the y-direction (response direction); (2) outliers in the x-direction (covariate space); and (3) outliers in both directions [16].
Common methods for robust regression include M estimation, high breakdown value estimation, and combinations of these two methods [16]. Huber (1973) introduced M estimation [12]. The method is computationally and theoretically simple. Its loss function reduces outliers’ contributions to the squared error loss, thereby limiting their impact on parameter estimates [13].
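A minimal sketch of Huber's loss illustrates the mechanism; the tuning constant 1.345 is the conventional choice for ~95% efficiency under normality, and this is an illustration rather than the exact loss used in the trials' analyses:

```python
def huber_rho(r, c=1.345):
    """Huber's loss: quadratic for small residuals, linear beyond c,
    so large residuals contribute far less than under squared error."""
    a = abs(r)
    return 0.5 * r * r if a <= c else c * (a - 0.5 * c)

# Squared error charges a 10-unit residual 50.0 (0.5 * 10**2);
# Huber's loss charges it only about 12.5.
print(huber_rho(10.0))
```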
Although M estimation is not robust to x-direction outliers, it is robust to y-direction outliers and is therefore well suited to scenarios in which the focus is on y-direction outliers [12,13,17]. Rapid progressors can be considered y-direction outliers. Hence, it is not surprising that both the sponsor and the FDA implemented robust regression with M estimation as post hoc sensitivity analyses of the clinical trials mentioned in Section 1.
3. Methods
The objectives of this study are (1) to characterize the probability of having meaningful imbalance across treatment arms in the number of rapidly progressing patients due to chance alone; (2) to assess the influence of imbalances in the number of rapidly progressing patients on estimates of treatment group differences from MMRM; and (3) to compare results from MMRM with the non-parametric and robust regression methods used in previous AD clinical trials.
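Objective (1) can be previewed with a quick Monte Carlo sketch. The inputs here are illustrative only: 2:1 randomization as in the simulations below and an assumed common 5% rapid-progressor probability; the study's own Table 1 and Table 2 inputs govern the actual results.

```python
import random

def prob_rp_rate_imbalance(n_drug=300, n_pbo=150, p_rp=0.05,
                           ratio=1.5, n_sim=5_000, seed=1):
    """Chance that the per-arm *rate* of rapid progressors differs by
    at least `ratio`-fold in either direction, purely by chance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        rate_d = sum(rng.random() < p_rp for _ in range(n_drug)) / n_drug
        rate_p = sum(rng.random() < p_rp for _ in range(n_pbo)) / n_pbo
        lo, hi = sorted((rate_d, rate_p))
        if lo == 0.0 or hi >= ratio * lo:
            hits += 1
    return hits / n_sim

# Under these illustrative inputs, the chance is substantial; the exact
# figure depends on the simulation inputs used in the study.
print(prob_rp_rate_imbalance())
```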
3.1. Simulated Data
The objectives of this study were pursued via simulation. Two main data scenarios were simulated. First, complete data were simulated and analyzed to assess results without the potentially confounding influence of non-random dropout. A second set of simulations was conducted in data with non-random subject dropout.
The complete data were simulated as a 2 × 3 × 3 factorial arrangement of scenarios, with 2 × 3 = 6 data scenarios and three methods of analysis applied to each data scenario. The simulations included the following:
- Two levels of magnitude of treatment effect: zero difference between drug and placebo in mean change from baseline, and a 0.5-point difference, approximately a 25% slowing of disease progression for drug compared with placebo, which is what was assumed in planning of the aducanumab studies noted in the introduction [1];
- Three levels (types) of data distribution: a normal distribution and two skewed distributions created via a mixture of “normal” and “rapidly progressing” patients; in one skewed distribution, the treatment effect was the same in rapid progressors as in the main subgroup, while in the second skewed distribution the treatment effect in rapid progressors was zero;
- Three methods of analysis: an MMRM analysis similar to what is commonly used in AD, the Hodges–Lehmann estimator (a non-parametric approach), and robust regression with M estimation.
In each of the data scenarios, 10,000 data sets were simulated, with 450 patients per data set, randomized in a 2:1 ratio to the simulated drug and placebo arms, respectively. In each of these data sets, no data were missing.
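A sketch of how such a normal/rapid-progressor mixture can be generated follows; the parameter values here are illustrative placeholders, not the actual inputs from Table 1 and Table 2.

```python
import random

def simulate_change(n, delta=0.0, p_rp=0.05, rp_extra=6.0, seed=None):
    """Change-from-baseline scores as a 95%/5% mixture of a main group
    and rapid progressors. Positive values denote worsening; `delta`
    shifts the drug arm. Illustrative parameters only."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n):
        y = rng.gauss(1.5 - delta, 2.0)          # main group
        if rng.random() < p_rp:                  # rapid progressor
            y += abs(rng.gauss(rp_extra, 2.0))   # extra worsening
        scores.append(y)
    return scores

drug = simulate_change(300, delta=0.5, seed=7)   # 2:1 randomization,
placebo = simulate_change(150, delta=0.0, seed=8)  # 450 patients total
```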
Input parameters for the various simulated data sets are summarized in Table 1 and Table 2.
Figure 1 is a plot of the Mixed_1 distribution, a mixture of a main group comprising 95% of the patients and rapid progressors comprising the remaining 5%. The distribution in Figure 1 is similar to the distribution of data from the AD clinical trials that motivated this investigation [1], indicating that the simulated data provided a reasonable approximation of the clinical trial data.
An additional set of simulations was conducted to extend results to the more realistic setting of incomplete data. For this simulation, the same inputs were used as for the Mixed_1 data, except the sample size was 570, and 20% monotone missing data were generated. The mechanism for data deletion was as follows: patients who showed any degree of improvement did not drop out, whereas patients who had some degree of worsening had a 13% probability of dropout at visit 2 (and were therefore missing both visit 2 and visit 3 observations), with an additional 7% dropping out at visit 3 (and therefore missing only the visit 3 observation). Because the observations that triggered dropout were deleted, the observed data did not fully explain the dropout, and the missing data mechanism was therefore missing not at random [18], suggesting some potential for bias in each of the three analytic methods because each assumes a missing-at-random mechanism. For the missing data scenario, 5000 data sets were simulated rather than 10,000 as in the complete data, due to the increased computational burden of dealing with missing data in the analyses.
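One plausible rendering of this deletion mechanism is sketched below. The 13% and 7% probabilities come from the text; treating a positive change at the to-be-deleted visit as the dropout trigger is an assumption of this sketch.

```python
import random

def apply_dropout(v1, v2, v3, rng):
    """Monotone MNAR dropout: worsening (positive change) at a visit can
    trigger deletion of that visit's and all later observations, so the
    deleted value itself drives the missingness. Improvers always complete."""
    if v2 > 0 and rng.random() < 0.13:
        return (v1, None, None)   # missing visits 2 and 3
    if v3 > 0 and rng.random() < 0.07:
        return (v1, v2, None)     # missing visit 3 only
    return (v1, v2, v3)

rng = random.Random(0)
# A patient who improves at every visit is always a completer:
print(apply_dropout(-0.5, -1.0, -1.5, rng))  # (-0.5, -1.0, -1.5)
```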
3.2. Analyses
The MMRM analyses were based on SAS PROC MIXED and restricted maximum likelihood estimation [19]. Changes from baseline were modeled using treatment and visit as categorical fixed effects, and baseline score and the baseline score by visit interaction as continuous covariates. Within-subject errors were modeled using an unstructured covariance matrix.
The Hodges–Lehmann (HL) approach estimated the median of all pairwise differences between patients in the treated and control groups. As such, it is a non-parametric approach that does not rely on distributional assumptions. The HL approach was implemented to test treatment group differences via SAS PROC NPAR1WAY applied to the data at visit 3 [20]. Statistical significance was based on whether the 95% confidence interval for the median difference contained 0.
Robust regression (RR) was implemented for the visit 3 data via PROC ROBUSTREG in SAS using M estimation and the bisquare weighting function [16]. The model included treatment as a categorical fixed effect and baseline score as a continuous covariate. Estimates were computed using iteratively reweighted least squares (IRLS), with a weighted least squares fit implemented inside an iteration loop. For each iteration, a set of weights for the observations was used in the least squares fit; the weights were constructed by applying the chosen weight function to the current residuals. Initial weights were based on residuals from an initial fit via unweighted least squares. The iteration terminated when a convergence criterion was satisfied [17].
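A stripped-down analogue of this loop for a location (intercept-only) problem is sketched below, with the scale fixed at the MAD of the initial residuals for simplicity; PROC ROBUSTREG re-estimates scale and handles covariates, so this only illustrates the reweighting idea.

```python
from statistics import median

def bisquare_weight(u, c=4.685):
    """Tukey bisquare weight: smoothly downweights large standardized
    residuals, reaching exactly zero beyond c."""
    a = abs(u) / c
    return (1.0 - a * a) ** 2 if a < 1.0 else 0.0

def irls_location(y, tol=1e-10, max_iter=200):
    """Robust location estimate via iteratively reweighted least squares."""
    mu = median(y)                                  # robust starting value
    s = median(abs(v - mu) for v in y) / 0.6745     # MAD-based scale
    s = s or 1.0                                    # guard a zero scale
    for _ in range(max_iter):
        w = [bisquare_weight((v - mu) / s) for v in y]
        new_mu = sum(wi * vi for wi, vi in zip(w, y)) / sum(w)
        done = abs(new_mu - mu) < tol
        mu = new_mu
        if done:
            break
    return mu

# The extreme value 100 gets weight ~0 and is effectively ignored:
print(irls_location([1, 2, 3, 4, 5, 100]))
```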
In the simulation scenario with missing data, the previously described MMRM analysis was fit to the incomplete data. For RR and HL, an analytic approach similar to that advocated by Mehrotra et al. [11] was used. Multiple imputation (MI) was implemented via PROC MI [21] using 25 rounds of imputation for each of the 5000 simulated data sets, with separate models for each treatment group that included baseline and post-baseline observations. The completed data sets were analyzed using HL and RR as previously described, and results were combined using Rubin’s rules as implemented in PROC MIANALYZE [22].
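For a scalar estimate, the combination step is simple enough to state exactly; this is a sketch of Rubin's rules as PROC MIANALYZE applies them.

```python
from statistics import mean, variance

def rubins_rules(estimates, squared_ses):
    """Combine results from m imputed data sets: pooled estimate q_bar and
    total variance T = W + (1 + 1/m) * B, where W is the mean within-
    imputation variance and B is the between-imputation variance."""
    m = len(estimates)
    q_bar = mean(estimates)
    w = mean(squared_ses)
    b = variance(estimates)      # sample variance across imputations
    t = w + (1 + 1 / m) * b
    return q_bar, t

# Hypothetical estimates/variances from three imputed data sets:
q, t = rubins_rules([-0.45, -0.55, -0.50], [0.04, 0.05, 0.045])
```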
MI, which was used for HL and RR, and MMRM make the same assumptions about missing data and lead to asymptotically similar results as the size of the data set and number of imputations increase [18]. Therefore, the difference in results between the various methods is not due to different means of handling missing data [18].
3.3. Outcomes
Outcomes used to assess results included the mean difference between treatments, the mean standard error of the treatment differences, the standard deviation of the treatment differences, and the percentage of data sets in which the treatment difference was statistically significant (α = 0.05). The standard deviation in treatment differences is the empirical standard error of the treatment differences and can be compared with the mean standard error to assess whether the model standard errors accurately reflect the uncertainty in the estimates.
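These four operating characteristics can be computed per method as follows (a sketch; the function and argument names are this example's, not the study code's):

```python
from statistics import mean, stdev

def operating_characteristics(deltas, model_ses, p_values, alpha=0.05):
    """Summaries across simulated data sets for one analysis method."""
    return {
        "mean_delta": mean(deltas),
        "mean_model_se": mean(model_ses),
        "empirical_se": stdev(deltas),   # SD of estimates across data sets
        "pct_significant": 100.0 * mean(p < alpha for p in p_values),
    }
```

Comparing `mean_model_se` with `empirical_se` shows whether the model-based standard errors track the actual sampling variability of the estimates.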
4. Results
Results from the normally distributed simulated data are summarized in Table 3. Each method yielded unbiased estimates of treatment group differences when the treatment effect (Δ) was 0.00 and −0.50. The mean standard errors and the standard deviations of the treatment differences were nearly identical within each method, but were greater in HL than in MMRM and RR. The percent of data sets with statistically significant differences was approximately equal to the nominal Type I error rate when Δ = 0.00 for MMRM and RR and lower than the nominal rate for HL. When Δ = −0.50, the percent of data sets with significant differences (power) was ~2% greater for MMRM than for RR and ~14% greater than for HL.
Results from the Mixed_1 set of simulations are summarized in Table 4. Each method yielded unbiased estimates of treatment group differences when Δ was 0.00 and −0.50. The mean standard errors and the standard deviation of the treatment differences were nearly identical within each method but were lower for RR than for HL and MMRM; that is, in MMRM, standard errors were greater than in normally distributed data, but for RR, standard errors were similar in the normal and skewed Mixed_1 simulated data sets. With Δ = 0.00, trends for Type I error were similar to those in normally distributed data. When Δ = −0.50, unlike in normally distributed data, where power was similar for MMRM and RR, power was ~12% greater for RR than for MMRM, with MMRM being similar to HL. Although the average Δ was similar across methods, in 20% of the data sets, the estimate from MMRM differed from the corresponding estimate in RR and HL by at least 0.15. In other words, although average estimates were similar, it was not unusual for the various analyses to have meaningful differences within data sets.
Results from the Mixed_1 set of simulations were further summarized by subsetting the data sets according to the ratio of rapidly progressing patients in the two treatment arms. Table 5 summarizes results when Δ = 0.00, and Table 6 summarizes results when Δ = −0.50.
The mean Δs from MMRM varied consistent with the ratio of rapidly progressing patients. In the subset of data sets in which the RP ratio was ≥1.5× on drug (more rapid progressors in the drug arm), the mean Δ from MMRM was less than the corresponding simulation input values of Δ = 0.00 (Table 5) and Δ = −0.50 (Table 6). The opposite trend existed when the RP ratio was ≥1.5× on placebo (more rapid progressors in the placebo arm), with the mean Δ from MMRM being greater than the corresponding simulation input values. In contrast, mean Δs from RR did not appreciably vary across the categories of data subsets defined by RP ratio. Results for HL were intermediate to those of MMRM and RR. When the ratio of RPs did not appreciably differ, the mean Δs from each method of analysis were close to the simulation input values.
Given the way the data sets were subset based on the RP ratio, the deviation of mean Δs from simulation input values is not a valid measure of bias but rather a means of assessing the stability (consistency) of results.
Results from the Mixed_2 set of simulations (no treatment effect in the rapidly progressing patients) are summarized in Table S1 by grouping the data sets according to the ratio of rapidly progressing patients in the two treatment arms. The results followed the same pattern as in the Mixed_1 set of simulations, with mean Δs from MMRM varying consistently with the ratio of rapidly progressing patients, whereas results from RR were consistent across data set groupings. See the Supplementary Material for more details.
Results from the Mixed_1 set of simulations in data with 20% dropout are summarized in Table 7. The results followed the same pattern as in the Mixed_1 and Mixed_2 sets of simulations that had no dropout. The advantage of RR over MMRM and HL in power was ~7% and ~10%, respectively. Each method provided control of Type I error at the nominal rate (or slightly less). The average estimated treatment contrast was slightly greater than the input value when the treatment effect was −0.50 because the analyses assumed a missing-at-random mechanism when the actual mechanism was missing not at random.
5. Discussion
Statistical theory suggests methods other than MMRM may be useful to consider when data are heavily skewed [11], as they were in the AD clinical trials that motivated this investigation. Via the central limit theorem, the concern in large trials regarding non-normal data is not bias; the concern is stability of results [12,13,14,15], although bias could be a concern in small trials.
This investigation showed that (1) chance alone can often result in substantial imbalance across treatment arms in the number of rapidly progressing patients, and (2) these imbalances can influence estimates of treatment group differences. In over half the simulated data sets with complete data (and hence no confounding from non-random dropout), the ratio of rapidly progressing patients was at least 1.5-fold greater on one arm than the other, and treatment group differences estimated via MMRM varied in accordance with the ratio of rapid progressors on drug versus placebo. Therefore, the imbalance and its consequences as seen in the recent AD clinical trials should be anticipated in planning AD studies.
This study also provided evidence, consistent with statistical theory, supporting the usefulness of robust methods. In normally distributed data, MMRM had ~2% more power than RR and ~14% more power than HL. As expected, MMRM and RR each controlled Type I error at the nominal level, with HL having approximately half the nominal rate.
In the skewed data, mimicking the AD clinical trials, each of the methods yielded unbiased estimates of the treatment contrast, but standard errors were smaller, and power was greater for RR than for MMRM and HL. When the data sets were subset by the ratio of rapidly progressing patients across treatment arms, the average results from MMRM varied in accordance with this ratio. The average treatment effect in MMRM was greater when the ratio of rapid progressors was higher on placebo, and the average treatment effect was lower with more rapid progressors on drug. Results from HL showed the same trend but to a lesser degree than MMRM. Robust regression yielded the most stable results, with smaller average standard errors and similar average treatment contrasts regardless of the ratio of rapid progressors. Similar results were seen in the scenario with 20% missing data.
These results should be interpreted considering several limitations. Although the simulated data were similar to the clinical trial data that motivated this investigation, information on the distribution of outcomes from other AD clinical trials is lacking. It may be that a log-normal or Cauchy distribution better describes AD data than the mixture distribution used here to simulate data. Moreover, other analytic approaches should be considered for assessing mean changes. For example, mixture models might perform better than RR if a mixture distribution best describes AD data. However, if the data are best described by a log-normal distribution, a log-transformation prior to an MMRM analysis may be better, or perhaps, quantile regression is a useful alternative.
6. Conclusions
Further investigation is needed to compare the strengths and limitations of analytic options over a wider set of conditions. However, the results of this investigation suggest that imbalances across treatment arms in the number of rapid progressors are likely in AD clinical trials. These imbalances influence results from MMRM, and it should not be assumed that MMRM is the optimum or only analysis needed in AD clinical trials.
Supplementary Materials
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/jdad1020007/s1, Table S1: Results from the Mixed_2 simulated data with Δ = −0.50 by groups defined by the ratio of rapidly progressing patients in the drug and placebo arms.
Author Contributions
Conceptualization, C.H.M. and G.M.; methodology, C.H.M. and I.L.; validation, I.L. and S.P.D.; formal analysis, C.H.M.; investigation, S.B.H. and G.M.; resources, S.B.H.; writing—original draft preparation, C.H.M.; writing—review and editing, I.L., S.P.D., G.M., and S.B.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
This study is based on simulated data, and the methods provide sufficient detail to recreate the simulations.
Acknowledgments
Chenge Zhang (Pentara Corporation) provided editorial support for this publication.
Conflicts of Interest
Ilya Lipkovich is an employee and minor shareholder of Eli Lilly and Company. Suzanne Hendrix owns and is an employee of Pentara Corporation. Craig Mallinckrodt and Sam Dickson are employees of Pentara Corporation, a company that consults with dozens of companies in the Alzheimer’s space, including Eli Lilly CO. However, this work was not part of that arrangement, and all individuals from these two companies collaborated without contractual obligation. Pentara Corporation funded this research.
References
- Budd-Haeberlein, S.; Aisen, P.S.; Barkhof, F.; Chalkias, S.; Chen, T.; Cohen, S.; Dent, G.; Hansson, O.; Harrison, K.; von Hehn, C.; et al. Two Randomized Phase 3 Studies of Aducanumab in Early Alzheimer’s Disease. J. Prev. Alzheimer’s Dis. 2022, 9, 197–210. [Google Scholar] [CrossRef]
- Dickson, S.P.; Hennessey, S.; Nicodemus Johnson, J.; Knowlton, N.; Hendrix, S.B. Avoiding future controversies in the Alzheimer’s disease space through understanding the aducanumab data and FDA review. Alzheimers Res. Ther. 2023, 15, 98. [Google Scholar] [CrossRef] [PubMed]
- Mallinckrodt, C.; Tian, Y.; Aisen, P.; Barkhof, F.; Cohen, S.; Dent, G.; Hansson, O.; Harrison, K.; Iwatsubo, T.; Mummery, C.J.; et al. Investigating Partially Discordant Results in Phase 3 Studies of Aducanumab. J. Prev. Alzheimer’s Dis. 2023, 10, 171–177. [Google Scholar] [CrossRef]
- Mallinckrodt, C.H.; Roger, J.; Chuang-Stein, C.; Molenberghs, G.; Lane, P.W.; O’kelly, M.; Ratitch, B.; Xu, L.; Gilbert, S.; Mehrotra, D.V.; et al. Missing data: Turning guidance into action. Stat. Biopharm. Res. 2013, 5, 369–382. [Google Scholar] [CrossRef]
- Coric, V.; van Dyck, C.H.; Salloway, S.; Andreasen, N.; Brody, M.; Richter, R.W.; Soininen, H.; Thein, S.; Shiovitz, T.; Pilcher, G.; et al. Safety and tolerability of the γ-secretase inhibitor avagacestat in a phase 2 study of mild to moderate Alzheimer disease. Arch. Neurol. 2012, 69, 1430–1440. [Google Scholar] [CrossRef]
- Egan, M.F.; Kost, J.; Voss, T.; Mukai, Y.; Aisen, P.S.; Cummings, J.L.; Tariot, P.N.; Vellas, B.; Van Dyck, C.H.; Boada, M.; et al. Randomized Trial of Verubecestat for Prodromal Alzheimer’s Disease. N. Engl. J. Med. 2019, 380, 1408–1420. [Google Scholar] [CrossRef]
- Ostrowitzki, S.; Lasser, R.A.; Dorflinger, E.; Scheltens, P.; Barkhof, F.; Nikolcheva, T.; Ashford, E.; Retout, S.; Hofmann, C.; Delmar, P.; et al. A phase III randomized trial of gantenerumab in prodromal Alzheimer’s disease. Alzheimers Res. Ther. 2017, 9, 95, Correction in Alzheimers Res. Ther. 2018, 10, 99. [Google Scholar] [CrossRef]
- Abu-Rumeileh, S.; Capellari, S.; Parchi, P. Rapidly Progressive Alzheimer’s Disease: Contributions to Clinical-Pathological Definition and Diagnosis. J. Alzheimer’s Dis. 2018, 63, 887–897. [Google Scholar] [CrossRef] [PubMed]
- Wang, P.; Lynn, A.; Song, Y.E.; Haines, J.L. Distinct features of rapidly progressive Alzheimer’s disease. Alzheimer’s Dement. 2022, 18, e063951. [Google Scholar] [CrossRef]
- Schmidt, C.; Wolff, M.; Weitz, M.; Bartlau, T.; Korth, C.; Zerr, I. Rapidly progressive Alzheimer disease. Arch. Neurol. 2011, 68, 1124–1130. [Google Scholar] [CrossRef] [PubMed]
- Mehrotra, D.V.; Li, X.; Liu, J.; Lu, K. Analysis of Longitudinal Clinical Trials with Missing Data Using Multiple Imputation in Conjunction with Robust Regression. Biometrics 2012, 68, 1250–1259. [Google Scholar] [CrossRef] [PubMed]
- Huber, P.J. Robust Regression: Asymptotics, Conjectures and Monte Carlo. Ann. Stat. 1973, 1, 799–821. [Google Scholar] [CrossRef]
- Huber, P.J. Robust Statistics; John Wiley & Sons, Inc.: New York, NY, USA, 1981. [Google Scholar]
- Rousseeuw, P.J.; Leroy, A.M. Robust Regression and Outlier Detection; John Wiley & Sons, Inc.: New York, NY, USA, 1987. [Google Scholar]
- Hampel, F.R.; Ronchetti, E.M.; Rousseeuw, P.J.; Stahel, W.A. Robust Statistics: The Approach Based on Influence Functions; Wiley: Hoboken, NJ, USA, 2005. [Google Scholar]
- SAS Institute Inc. Chapter 104 The ROBUSTREG Procedure. SAS/STAT® 15.1 User’s Guide; SAS Institute Inc.: Cary, NC, USA, 2018. [Google Scholar]
- Holland, P.; Welsch, R. Robust Regression Using Iteratively Reweighted Least-Squares. Commun. Statist. Theor. Meth. 1977, 6, 813–827. [Google Scholar] [CrossRef]
- Mallinckrodt, C.H.; Lipkovich, I. A Practical Guide to Analyzing Longitudinal Clinical Trial Data; CRC Press: Boca Raton, FL, USA, 2017. [Google Scholar]
- SAS Institute Inc. Chapter 81 The MIXED Procedure. SAS/STAT® 15.1 User’s Guide; SAS Institute Inc.: Cary, NC, USA, 2018. [Google Scholar]
- SAS Institute Inc. Chapter 87 The NPAR1WAY Procedure. SAS/STAT® 15.1 User’s Guide; SAS Institute Inc.: Cary, NC, USA, 2018. [Google Scholar]
- SAS Institute Inc. Chapter 79 The MI Procedure. SAS/STAT® 15.1 User’s Guide; SAS Institute Inc.: Cary, NC, USA, 2018. [Google Scholar]
- SAS Institute Inc. Chapter 80 The MIANALYZE Procedure. SAS/STAT® 15.1 User’s Guide; SAS Institute Inc.: Cary, NC, USA, 2018. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).