1 of 26

Transfer Learning for Cognitive Reserve Quantification

SMI 2023

Seonjoo Lee

Associate Professor of Clinical Biostatistics (in Psychiatry), Department of Biostatistics and Psychiatry

Member of Data Science Institute, Columbia University

Mental Health Data Science, New York State Psychiatric Institute

Image is from https://brain-smart.com/learning/turning-learning-into-action-with-a-little-help-from-our-brain/

2 of 26

This work was supported by R01AG062578-01A1 (PI: Lee), R01AG026158 (PI: Stern), R01AG038465 (PI: Stern), K01MH122774 (PI: Zhu), and a Brain and Behavior Research Foundation Grant (PI: Zhu).

3 of 26

Study Aim

  • Quantify cognitive reserve using MRI measures to study pre-clinical AD participants from multi-site and multi-scanner studies.
  • Unbiased cognitive reserve quantification by leveraging lifespan data.
  • Generalizable quantification using transfer learning to account for study differences (e.g. different neuropsychological assessments but same cognitive construct).
  • A cascade neural network (CNN) was chosen as the prediction model.

4 of 26

Cognitive Reserve

  • “The concept of cognitive reserve provides an explanation for differences between individuals in susceptibility to age-related brain changes or pathology (e.g., Alzheimer's disease), whereby some people can tolerate more of these changes than others and maintain function.” (Stern, 2002)
  • Lifetime exposures including educational and occupational attainment, and leisure activities in late life, can increase this reserve (Stern, 2012).

Stern, Yaakov. "What is cognitive reserve? Theory and research application of the reserve concept." Journal of the International Neuropsychological Society 8, no. 3 (2002): 448-460.

Stern, Yaakov. "Cognitive reserve in ageing and Alzheimer's disease." The Lancet Neurology 11, no. 11 (2012): 1006-1012.

5 of 26

Quantification of Cognitive Reserve

  • Typical CR proxy measures include years of education (Meng & D'Arcy, 2012; Stern et al., 1994), premorbid IQ (Alexander et al., 1997), occupational achievement, and engagement in cognitively and socially stimulating activities (Scarmeas & Stern, 2003).
  • Residual approach: CR is quantified as the difference (residual) between an individual’s actual cognitive performance and the performance predicted from that individual’s brain status and neuropathology (Reed et al., 2010).
    • High-reserve individuals exhibit higher actual measured cognitive performance than that predicted.
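
The residual definition above can be sketched in a few lines. This is a minimal illustration on synthetic data, assuming an ordinary least-squares fit as a stand-in for the prediction model; the names `brain`, `memory`, and `cognitive_reserve_residual` are hypothetical and do not come from the study.

```python
import numpy as np

def cognitive_reserve_residual(brain, memory):
    """Return observed minus predicted memory (positive = higher reserve)."""
    # Design matrix with an intercept column; OLS is an assumed stand-in model.
    X = np.column_stack([np.ones(len(brain)), brain])
    coef, *_ = np.linalg.lstsq(X, memory, rcond=None)
    predicted = X @ coef
    return memory - predicted

# Synthetic example: three brain measures and one memory score per person.
rng = np.random.default_rng(0)
brain = rng.normal(size=(200, 3))
memory = brain @ np.array([0.5, -0.2, 0.3]) + rng.normal(scale=0.5, size=200)

cr = cognitive_reserve_residual(brain, memory)
```

A positive entry of `cr` marks a person whose measured memory exceeds what the brain measures predict, which matches the high-reserve case described above.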

6 of 26

Motivation of the current study

  • To quantify cognitive reserve in preclinical AD.
  • Structural magnetic resonance imaging (sMRI) has been used as a measure of the regional brain atrophy underlying cognitive decline and dementia (Mueller, Schuff, & Weiner, 2006).
  • Typically, CR is based on the regression models from older participants (Reed et al., 2010; Zahodne et al., 2015; Zahodne et al., 2013).
  • Using lifespan data may quantify cognition more accurately. However, lifespan brain and cognition data have not yet been leveraged to quantify cognitive reserve.

7 of 26

Why transfer learning? Why does direct estimation from a previously trained model fail?

  • Brain imaging data from multiple sites may vary substantially because of different MRI sequences and scanners.
  • Studies use different neuropsychological assessments for the same domain (e.g., memory).
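
The first point can be illustrated with a small simulation. This is only a sketch on synthetic data, assuming the scanner effect is a linear gain and offset on the measured features and that a least-squares model stands in for the trained network; `simulate_site`, `gain`, and `offset` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(1)
w = np.array([0.8, -0.4])   # assumed true brain-to-memory weights

def simulate_site(n, gain, offset):
    """Synthetic site: same brain-memory relation, different measurement scale."""
    true_brain = rng.normal(size=(n, 2))
    memory = true_brain @ w + rng.normal(scale=0.3, size=n)
    measured = gain * true_brain + offset   # scanner-specific distortion (assumed linear)
    return measured, memory

def fit_ols(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

X_src, y_src = simulate_site(500, gain=1.0, offset=0.0)   # source site
X_tgt, y_tgt = simulate_site(500, gain=1.5, offset=2.0)   # target site, other scanner

coef = fit_ols(X_src, y_src)                               # trained on source only
mae_src = np.mean(np.abs(predict(coef, X_src) - y_src))
mae_tgt = np.mean(np.abs(predict(coef, X_tgt) - y_tgt))   # applied directly to target
```

In this toy setting the error on the target site is larger than on the source site even though the underlying brain-memory relation is identical, because the model has only seen the source scanner's measurement scale.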

8 of 26

Datasets

RANN

    • N=496, 20-80 years old
    • Single site, single scanner
    • Same image acquisition protocol as ADNI
    • Memory Score: sub-scores of the SRT: the long-term storage sub-score, continuous long-term retrieval, and the number of words recalled on the last trial. 

HCP-Aging

    • N=620, 36-90 years old
    • 4 sites, same scanner vendor
    • Modified HCP protocol
    • Memory Score: Picture Sequence Memory Test and Rey Auditory Verbal Learning Test (RAVLT).

ADNI

    • N=941, 55-90 years old
    • 52 sites, different scanners
    • ADNI protocol (3T MPRAGE)
    • Memory Score: Rey Auditory Verbal Learning Test (RAVLT, 2 versions), ADAS-Cog, MMSE, and Logical Memory. 

9 of 26

10 of 26

Prediction model: Cascade Neural Network

  • A class of neural networks similar to feed-forward networks, but with connections from the input and every previous layer to each following layer.
  • In such a network, the output layer is connected directly to the input layer as well as to the hidden layers.
  • Advantages (Fahlman, 1990)
    • Learns very quickly
    • The network determines its own size
    • Retains the structures it has built even if the training set changes
    • Requires no back-propagation of error signals through the connections of the network
  • It has been shown to outperform other common classical machine learning approaches in brain-age prediction (Chen et al., 2020).
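
The connectivity described above can be sketched as a forward pass. This is a minimal illustration of cascade-style connections with a fixed architecture, not the growth procedure of Fahlman (1990) and not the model used in the study; layer sizes, the `tanh` activation, and the function names are assumptions.

```python
import numpy as np

def init_cascade(n_inputs, hidden_sizes, rng):
    """Create weights where each layer sees the input plus all earlier layers."""
    params = []
    fan_in = n_inputs
    for size in list(hidden_sizes) + [1]:       # final entry is the scalar output
        W = rng.normal(scale=0.1, size=(fan_in, size))
        b = np.zeros(size)
        params.append((W, b))
        fan_in += size                           # later layers also receive this layer
    return params

def cascade_forward(params, X):
    """Forward pass: every layer takes the concatenation of X and prior outputs."""
    features = X
    for W, b in params[:-1]:
        h = np.tanh(features @ W + b)
        features = np.concatenate([features, h], axis=1)
    W_out, b_out = params[-1]                    # output sees input and all hidden layers
    return (features @ W_out + b_out).ravel()

rng = np.random.default_rng(0)
params = init_cascade(n_inputs=5, hidden_sizes=[4, 3], rng=rng)
X = rng.normal(size=(10, 5))                     # 10 synthetic subjects, 5 features
y_hat = cascade_forward(params, X)
```

With 5 inputs and hidden layers of 4 and 3 units, the second hidden layer receives 9 values and the output layer receives 12, which is the direct input-to-output connection the slide describes.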

11 of 26

Transfer Learning

  • Transfer learning is a machine learning technique that utilizes the knowledge gained from one task and applies it to a different but related task.
  • It is a popular approach that allows faster progress or improved performance when modeling the second task.
  • The sMRI obtained from various sites or scanners may represent similar brain properties but may exhibit different observational distributions. Thus, the transfer learning approach may be applied to improve the generalizability of the sMRI-based residual models.
  • In this study, we propose a CR quantification framework that leverages a large, single-site lifespan dataset and uses transfer learning to handle scanner and site differences.
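
The idea of reusing a pre-trained model and tuning it on a new domain can be sketched as follows. This is a toy example on synthetic data, assuming a one-hidden-layer network, plain gradient descent on mean squared error, and a target domain whose memory score is on a different scale; all names and settings are hypothetical and are not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
v = np.array([1.0, -1.0, 0.5])

def make_data(n, scale):
    """Synthetic data: same brain features, memory score on a domain-specific scale."""
    X = rng.normal(size=(n, 3))
    y = scale * np.tanh(X @ v) + rng.normal(scale=0.1, size=n)
    return X, y

def forward(W1, w2, X):
    H = np.tanh(X @ W1)                  # hidden layer
    return H, H @ w2                     # linear output

def mse(W1, w2, X, y):
    return float(np.mean((forward(W1, w2, X)[1] - y) ** 2))

def train(W1, w2, X, y, steps=500, lr=0.05, freeze_hidden=False):
    """Gradient descent on mean squared error; optionally freeze the hidden layer."""
    W1, w2 = W1.copy(), w2.copy()
    n = len(y)
    for _ in range(steps):
        H, pred = forward(W1, w2, X)
        err = pred - y
        grad_w2 = 2 * H.T @ err / n
        if not freeze_hidden:
            grad_H = 2 * np.outer(err, w2) / n
            W1 -= lr * (X.T @ (grad_H * (1 - H ** 2)))
        w2 -= lr * grad_w2
    return W1, w2

# Pre-train on the source domain.
X_src, y_src = make_data(500, scale=1.0)
W1_pre, w2_pre = train(rng.normal(scale=0.5, size=(3, 8)),
                       rng.normal(scale=0.1, size=8), X_src, y_src)

# Fine-tune only the output weights on a target tuning set.
X_tune, y_tune = make_data(300, scale=1.5)
X_test, y_test = make_data(150, scale=1.5)
W1_ft, w2_ft = train(W1_pre, w2_pre, X_tune, y_tune, steps=200, freeze_hidden=True)

mse_direct = mse(W1_pre, w2_pre, X_test, y_test)   # pre-trained model applied directly
mse_tuned = mse(W1_ft, w2_ft, X_test, y_test)      # after tuning on the target domain
```

Freezing the hidden layer keeps the representation learned from the source data and adapts only the final mapping, which is one common way to tune with a small target sample.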

12 of 26

Source Domain - CNN

  • The RANN dataset was split into the training set (70%) and test set (30%) using a conditionally random method.
  • Cascade neural network models were used to train the RANN dataset for memory prediction.
  • Hyperparameter tuning: random search over the model hyperparameters (number of hidden layers, number of neurons, regularization penalty, and type of activation function)
  • Loss function: mean squared error
  • 10-fold cross-validation
  • Performance comparison: Pearson’s correlation coefficient and mean absolute error (MAE) between the predicted and true memory scores
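
The workflow above (70/30 split, random search, 10-fold cross-validation, Pearson correlation and MAE) can be sketched as follows. This is an illustration on synthetic data in which ridge regression is swapped in for the cascade neural network so the example stays short, the split is a plain random split rather than the conditional one used in the study, and the search covers only a regularization penalty; all names and ranges are assumptions.

```python
import numpy as np

def fit_ridge(X, y, penalty):
    A = np.column_stack([np.ones(len(X)), X])
    reg = penalty * np.eye(A.shape[1])
    reg[0, 0] = 0.0                               # do not penalise the intercept
    return np.linalg.solve(A.T @ A + reg, A.T @ y)

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def cv_mse(X, y, penalty, folds):
    """Mean validation MSE across the given folds."""
    errs = []
    for val_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        coef = fit_ridge(X[train_mask], y[train_mask], penalty)
        errs.append(np.mean((predict(coef, X[val_idx]) - y[val_idx]) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))                      # synthetic brain features
y = X @ rng.normal(size=6) + rng.normal(scale=0.5, size=400)

# 70% training set, 30% test set.
perm = rng.permutation(len(y))
n_train = int(0.7 * len(y))
train_idx, test_idx = perm[:n_train], perm[n_train:]
X_tr, y_tr, X_te, y_te = X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Random search over the penalty, scored by 10-fold cross-validated MSE.
folds = np.array_split(rng.permutation(n_train), 10)
candidates = 10 ** rng.uniform(-3, 2, size=20)
best_penalty = min(candidates, key=lambda p: cv_mse(X_tr, y_tr, p, folds))

# Refit on the full training set and evaluate on the held-out test set.
coef = fit_ridge(X_tr, y_tr, best_penalty)
pred = predict(coef, X_te)
pearson_r = np.corrcoef(pred, y_te)[0, 1]
mae = np.mean(np.abs(pred - y_te))
```

The test set is touched only once, after the hyperparameters have been chosen on the training folds, so `pearson_r` and `mae` are not influenced by the search.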

13 of 26

Target Domain – transfer learning

  • TL
    • The whole HCPA or ADNI dataset was randomly divided into a tuning pool (70%) and a test set (30%).
    • The refined optimization procedure of the transfer learning tuned the optimal tuning sample, the regularization ratio (0 to 1), the loss function (i.e., mean squared error), and the choice of which layers were frozen.
  • Transfer learning with cotrain (TLCO)
    • TLCO combines the training set from the source domain with the tuning set from the target domain to tune the pre-trained CNN model.
    • The TLCO approach accounts for inter-site differences by regressing out site-specific effects using statistical covariates.
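
The pooling and site adjustment used in TLCO can be sketched as follows. The slides do not specify the exact adjustment, so this example assumes the simplest case: a site indicator is used as the covariate and only site-specific mean differences are removed from each feature. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def regress_out_site(X, site):
    """Remove site-specific mean differences from each feature via OLS on site dummies."""
    labels = np.unique(site)
    D = (site[:, None] == labels[None, :]).astype(float)   # one-hot site design matrix
    beta, *_ = np.linalg.lstsq(D, X, rcond=None)            # per-site feature means
    return X - D @ beta + X.mean(axis=0)                    # keep the pooled mean

rng = np.random.default_rng(0)
X_src_train = rng.normal(loc=0.0, size=(200, 4))   # synthetic source training features
X_tgt_tune = rng.normal(loc=1.0, size=(80, 4))     # synthetic target tuning features (shifted)

# Pool both sets, label each row with its site, and adjust the features.
X_pool = np.vstack([X_src_train, X_tgt_tune])
site = np.array([0] * len(X_src_train) + [1] * len(X_tgt_tune))
X_adj = regress_out_site(X_pool, site)
```

The adjusted pool `X_adj` would then be used to tune the pre-trained model. Additional covariates could be appended to the design matrix `D` in the same way.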

14 of 26

Demographics

15 of 26

Demographics by scanner type (ADNI)

16 of 26

17 of 26

Source Domain: RANN dataset after random search

A) Training set

B) Test set

18 of 26

Transfer from RANN to HCPA

Tuning set using HCPA

Test set after tuning using HCPA

Test set with no tuning

19 of 26

ADNI dataset, applying the pretrained model from RANN

Tuning set using ADNI data

Test set after tuning using ADNI data

Test set if applying the pretrained model directly

20 of 26

ADNI dataset by scanner type using random-search models

21 of 26

Model performance for the ADNI dataset by scanner manufacturer using the random-search model

22 of 26

Correlation between CR and CR proxy measures by Clinical Stage

23 of 26

Correlation between CR and CR proxy measures by Scanner

24 of 26

Summary

  • Cognitive reserve in pre-symptomatic Alzheimer’s patients can be quantified from brain measures by leveraging lifespan data.
  • Multi-center, multi-scanner, and multi-sequence acquisition can affect the performance of the quantification.
  • Leveraging lifespan data from a single site can improve the performance.
  • Transfer learning allows the pre-trained network to be applied successfully to datasets acquired from different domains or age groups.

25 of 26

Postdoctoral Fellow Positions!

  • 2 positions are available
  • Postdoctoral fellow in collaboration with Anxiety clinic (Drs. Xi Zhu and Seonjoo Lee will jointly supervise)
  • Postdoctoral fellow in Mental Health Data Science (Drs. Ying Liu and Seonjoo Lee will jointly supervise)
  • Please contact seonjoo.lee@nyspi.columbia.edu

26 of 26

THANK YOU!

Seonjoo Lee

Seonjoo.lee@nyspi.columbia.edu