Heterogeneous Continual Learning
Presented by: Lucas Wu
Def. Continual Learning
Learn a sequence of tasks one after another, so that knowledge from earlier tasks can help the latter ones
Current Approach-1
Replay methods: keep a small buffer of past examples and rehearse them alongside the new task
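As an illustration (not from the slides), the replay idea can be sketched as a fixed-size buffer filled by reservoir sampling; all names here are my own:

```python
import random

class ReplayBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.data = []
        self.seen = 0                      # total examples observed so far
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Keep each observed example with equal probability capacity/seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k):
        # Mini-batch of stored examples to rehearse alongside the new task.
        return self.rng.sample(self.data, min(k, len(self.data)))
```

During training on a new task, each batch would mix fresh examples with a `sample()` from the buffer.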
Current Approach-2
Regularize the model change
(Reduce the weight variation)
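A minimal sketch of this idea, assuming a quadratic penalty on weight change (EWC-style when weighted by parameter importance, plain L2 otherwise; the function name is illustrative):

```python
def reg_loss(params, old_params, importance, lam=1.0):
    """Quadratic penalty on deviation from the previous task's weights.

    With importance = Fisher estimates this is EWC-style regularization;
    with importance = all ones it reduces to a plain L2 penalty.
    """
    return lam * sum(f * (p - p0) ** 2
                     for p, p0, f in zip(params, old_params, importance))
```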
Current Approach-3
Parameter-isolated methods
Same architecture
Can be any CL method
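One way to sketch parameter isolation, assuming hard binary masks that give each task a disjoint subset of a shared weight vector (the single-layer setup and all names are illustrative):

```python
def masked_forward(weights, task_masks, task_id, x):
    """Evaluate a single linear 'layer' using only the weights assigned
    to task_id; the binary masks give each task a disjoint parameter subset,
    so updating one task's weights cannot overwrite another's."""
    active = [w if m else 0.0 for w, m in zip(weights, task_masks[task_id])]
    return sum(w * xi for w, xi in zip(active, x))
```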
Motivation from real-world examples
Questions?
Sketch of the solutions
weak: the previous (old-architecture) model
strong: the new (stronger-architecture) model
Sketch of the solutions
probability distribution
Soft CE: cross-entropy loss between the predicted and soft target label distributions
KL Divergence: a measure of the difference between two prediction probability distributions
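The two losses above can be written down directly from their definitions (a sketch over plain Python lists, not the talk's implementation):

```python
import math

def soft_cross_entropy(p, q):
    """Cross-entropy between a soft target distribution p and prediction q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL divergence KL(p || q) between two probability distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

Since soft CE(p, q) = KL(p || q) + H(p), and the entropy H(p) of the target does not depend on the prediction q, the two objectives yield the same gradients with respect to the prediction.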
Augmentation:
Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Yunqing Zhao, and Ngai-Man Cheung. Revisiting label smoothing and knowledge distillation compatibility: What was missing? In Proceedings of the International Conference on Machine Learning (ICML), 2022.
Sketch of the solutions
Hyperparameter for label smoothing
Objective:
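A minimal sketch of label smoothing, where `eps` is the smoothing hyperparameter (the function and parameter names are illustrative):

```python
def smooth_labels(one_hot, eps=0.1):
    """Label smoothing: move a fraction eps of the probability mass from the
    one-hot target to a uniform distribution over all classes."""
    n = len(one_hot)
    return [(1.0 - eps) * y + eps / n for y in one_hot]
```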
Sketch of the solutions (w/ buffer)
Hyperparameter for label smoothing
Objective:
Knowledge distillation loss
Buffer: size = 200, same as the replay method
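A hedged sketch of a knowledge-distillation loss on buffered examples, assuming temperature-scaled softmax and KL between teacher (old model) and student (new model); the temperature value is illustrative:

```python
import math

def softmax(logits, t=1.0):
    """Temperature-scaled softmax (numerically stabilized)."""
    m = max(logits)
    exps = [math.exp((z - m) / t) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, t=2.0):
    """KL(teacher || student) on softened predictions for a buffered example."""
    p = softmax(teacher_logits, t)
    q = softmax(student_logits, t)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```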
How to extract features (w/o buffer)
Objective:
How to extract features
Objective: refers to DeepInversion
10% of samples, randomly selected from previous tasks
Encourage spatial continuity in the generated images, thus avoiding excessive noise and unnatural patterns
Encourage pixel-level similarity between the generated image and the original image
Encourage the generated image to be similar to the target image in the feature space
0.5K iterations
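The spatial-continuity and pixel-similarity terms above can be sketched as image priors in the style of DeepInversion (grayscale images as nested lists; function names and the missing loss weighting are my own assumptions):

```python
def tv_prior(img):
    """Total-variation prior: penalize neighboring-pixel differences,
    encouraging spatial continuity in the generated image."""
    h = sum((img[i][j] - img[i][j + 1]) ** 2
            for i in range(len(img)) for j in range(len(img[0]) - 1))
    v = sum((img[i][j] - img[i + 1][j]) ** 2
            for i in range(len(img) - 1) for j in range(len(img[0])))
    return h + v

def l2_prior(img, ref):
    """Pixel-level similarity between the generated image and a reference."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(img, ref)
               for a, b in zip(row_a, row_b))
```

A feature-space term would analogously penalize distance between network activations on the generated and target images.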
Deep Inversion (DI) vs. Quick Deep Inversion (QDI)
[Figure: example "Dog" images generated by DI and QDI]
Experiment setting
Evaluation metrics:
Average accuracy
Average forgetting
Task-incremental continual learning.
Class-incremental continual learning.
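Assuming the standard accuracy-matrix formulation (my notation, not from the slides), the two evaluation metrics can be computed as:

```python
def average_accuracy(acc):
    """acc[i][j] = accuracy on task j measured after training on task i.
    Average accuracy: mean accuracy over all tasks after the final task."""
    final = acc[-1]
    return sum(final) / len(final)

def average_forgetting(acc):
    """Mean drop from each earlier task's best past accuracy to its final
    accuracy (the last task is excluded: it has had no chance to be forgotten)."""
    T = len(acc)
    drops = [max(acc[i][j] for i in range(T)) - acc[-1][j]
             for j in range(T - 1)]
    return sum(drops) / len(drops)
```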
Conclusion
Best performance in:
Ablation study:
Limitation
Questions?