Learning Without Forgetting (LwF)
Presenters: Irene Tenison, Sai Aravind Sreeramadas
Introduction - Objective
LwF Setting
𝛳s : shared parameters
𝛳o & 𝛳n : task specific parameters of the old tasks and of the new task
Relevant Methods - Feature Extraction
Relevant Methods - Fine Tuning
Relevant Methods - Joint Training (Multitask learning)
Comparison of Methods
LwF Goal
Given a CNN with shared parameters, 𝛳s, and task specific parameters of previous tasks, 𝛳o, the goal of LwF is to add task specific parameters for the new task, 𝛳n, and to learn parameters (𝛳s, 𝛳o, 𝛳n) that work well on both the old and the new tasks, using data from the new task only.
Method
Starts with:
𝛳s and 𝛳o : shared parameters and task specific parameters of old tasks
Xn, Yn : data of the new task
Initialize:
Yo ← CNN(Xn, 𝛳s, 𝛳o) : outputs of the old tasks, recorded on the new task data
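The initialization step can be sketched as follows. This is a minimal sketch on a toy network, not the presenters' implementation: the helpers `linear`, `softmax`, `forward_old` and `initialize_lwf` are hypothetical names, the network is a single shared linear layer with ReLU instead of a CNN, and the small random initialization of 𝛳n is an assumption.

```python
import math
import random

def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(a - m) for a in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward_old(theta_s, theta_o, x):
    """Toy stand-in for the CNN: shared layer + ReLU, then the old-task head."""
    hidden = [max(0.0, h) for h in linear(theta_s, x)]
    return softmax(linear(theta_o, hidden))

def initialize_lwf(theta_s, theta_o, X_n, num_new_classes, seed=0):
    # Record the old-task outputs on the new-task inputs before any update.
    Y_o = [forward_old(theta_s, theta_o, x) for x in X_n]
    # Assumption: the new head starts from small random weights.
    rng = random.Random(seed)
    hidden_dim = len(theta_s)
    theta_n = [[rng.gauss(0.0, 0.01) for _ in range(hidden_dim)]
               for _ in range(num_new_classes)]
    return Y_o, theta_n

# Toy parameters: 2 inputs, 2 hidden units, 3 old classes, 4 new classes.
theta_s = [[1.0, 0.0], [0.0, 1.0]]
theta_o = [[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]]
X_n = [[1.0, 2.0], [0.5, 0.5]]
Y_o, theta_n = initialize_lwf(theta_s, theta_o, X_n, num_new_classes=4)
```

The recorded Yo plays the role of a fixed target for the old tasks during training, which is why it is computed once, before 𝛳s and 𝛳o start to change.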
Training
- Weight decay of 0.0005
- Loss balance weight λo, which trades off the old task loss against the new task loss
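The two training bullets fit together in a single objective. The form below is a sketch of that objective: the regularizer $\mathcal{R}$ is assumed to be the weight decay term, $\lambda_o$ is the loss balance weight, and $\hat{Y}_o$, $\hat{Y}_n$ denote the current network outputs for the old and new tasks on the new task data.

```latex
\theta_s^{*},\ \theta_o^{*},\ \theta_n^{*} \leftarrow
\operatorname*{argmin}_{\hat{\theta}_s,\ \hat{\theta}_o,\ \hat{\theta}_n}
\Big( \lambda_o\, \mathcal{L}_{old}(Y_o, \hat{Y}_o)
    + \mathcal{L}_{new}(Y_n, \hat{Y}_n)
    + \mathcal{R}(\hat{\theta}_s, \hat{\theta}_o, \hat{\theta}_n) \Big)
```

A larger $\lambda_o$ favors preserving old task behavior; a smaller one favors new task accuracy.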
Loss - new task
$$\mathcal{L}_{new}(y_n, \hat{y}_n) = -\, y_n \cdot \log \hat{y}_n$$
Where, $y_n$ - one-hot ground truth label vector
$\hat{y}_n$ - softmax output of the new network
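The new task loss is the usual cross-entropy between a one-hot label and a softmax output. A minimal sketch in plain Python follows; the function name `new_task_loss` and the `eps` guard against log(0) are assumptions added for illustration.

```python
import math

def new_task_loss(y_true, y_pred, eps=1e-12):
    """Cross-entropy -y . log(y_hat) for a one-hot y_true and a softmax output y_pred."""
    return -sum(t * math.log(p + eps) for t, p in zip(y_true, y_pred))

# One sample with three classes; the true class is index 1.
loss = new_task_loss([0, 1, 0], [0.2, 0.7, 0.1])
```

Because the label is one-hot, only the predicted probability of the true class contributes to the sum.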
Loss - old task
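$$\mathcal{L}_{old}(y_o, \hat{y}_o) = -\sum_{i=1}^{l} y_o^{\prime(i)} \log \hat{y}_o^{\prime(i)}$$
and
$$y_o^{\prime(i)} = \frac{(y_o^{(i)})^{1/T}}{\sum_j (y_o^{(j)})^{1/T}}, \qquad \hat{y}_o^{\prime(i)} = \frac{(\hat{y}_o^{(i)})^{1/T}}{\sum_j (\hat{y}_o^{(j)})^{1/T}}$$
Where, $l$ is the number of labels, $T$ is the temperature,
$y_o^{\prime(i)}$ is the modified version of the recorded probability $y_o^{(i)}$ and,
$\hat{y}_o^{\prime(i)}$ is the modified version of the current probability $\hat{y}_o^{(i)}$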
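The old task loss compares the recorded and the current old task probabilities after both have been modified with a temperature. The sketch below assumes the modification raises each probability to the power 1/T and renormalizes; the function names `soften` and `old_task_loss` are hypothetical, and the default T = 2.0 is an assumption, since the slide does not state a value.

```python
import math

def soften(probs, T=2.0):
    """Raise each probability to 1/T and renormalize so the result sums to 1."""
    powered = [p ** (1.0 / T) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]

def old_task_loss(y_recorded, y_current, T=2.0, eps=1e-12):
    """Cross-entropy between the softened recorded and softened current probabilities."""
    y_r = soften(y_recorded, T)
    y_c = soften(y_current, T)
    return -sum(r * math.log(c + eps) for r, c in zip(y_r, y_c))

recorded = [0.7, 0.2, 0.1]   # old-task output stored before training
unchanged = [0.7, 0.2, 0.1]  # current output that still matches the record
drifted = [0.1, 0.2, 0.7]    # current output after the old task was forgotten

loss_unchanged = old_task_loss(recorded, unchanged)
loss_drifted = old_task_loss(recorded, drifted)
```

With T greater than 1 the small probabilities gain weight, so the loss also penalizes changes in how the network ranks the less likely old labels, not only the top one.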
LwF :
Implementation details:
Methods being compared with
Experiments
1) Single new task scenario - add all classes of the new data at once
2) Multiple new task scenario - add subgroups of the dataset one at a time
3) Influence of dataset size - train on subsets of the new task data
Experiments
Data
1- Single new task scenario
Observations
2- Multiple new task scenario
Classes are added to the classification task in groups rather than all at once
For this experiment the VOC dataset is split into subgroups of similar classes (transport, animals, objects) and the network is trained on them one after another
The Scenes dataset is split in the same way (large, medium and small rooms)
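The grouping described above can be sketched as a simple partition of the labelled data into an ordered list of tasks. This is only an illustration: the function `split_into_tasks`, the sample names, and the group names are hypothetical, and the loop body stands in for the actual LwF training on each group.

```python
def split_into_tasks(samples, class_groups):
    """Partition (input, label) pairs into one task per class group, keeping group order."""
    tasks = []
    for group_name, classes in class_groups:
        members = [(x, y) for x, y in samples if y in classes]
        tasks.append((group_name, members))
    return tasks

# Hypothetical toy data and groups, not the real dataset splits.
samples = [("img0", "cat"), ("img1", "bus"), ("img2", "dog"), ("img3", "chair")]
class_groups = [("animals", {"cat", "dog"}),
                ("vehicles", {"bus"}),
                ("furniture", {"chair"})]

tasks = split_into_tasks(samples, class_groups)
heads = []
for group_name, task_samples in tasks:
    # Placeholder for one LwF round: record outputs of all existing heads on
    # task_samples, add a head for this group, then train on task_samples only.
    heads.append(group_name)
```

Each round treats every head added so far as an old task, so the number of old task outputs to preserve grows with each new group.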
2- Multiple new task scenario
Observations
3- Influence of dataset size
Design choices and alternatives
Additional Experiments
Tracking with MDNet using LwF
As in the multiple new task scenario for classification, LwF is applied to tracking by treating tracking as a classification task
Incrementally adding new objects for detection is a real use case for LwF.
Advantages and Disadvantages
Advantages :
Disadvantages :
Takeaways
Discussion