1 of 12

Overcoming catastrophic forgetting in neural networks

14.3.2017

Vit Listik

2 of 12

Overview

  • Neural nets
  • Catastrophic forgetting
  • Different tasks
  • Sequential learning
  • Selectively slowing down learning
  • An approach borrowed from nature

3 of 12

Task

  • Catastrophic forgetting is a limiting factor for AGI development
  • Mammals are good at this task
    • Higher-level knowledge is encoded in less plastic synapses (synaptic consolidation)
  • Goal: train one NN on several tasks sequentially

4 of 12

Solution

  • Elastic weight consolidation (EWC)
    • Constraint: a quadratic penalty on weight changes
    • Limited to the weights important for previous tasks
    • Derived from the conditional probability of the parameters given all data (see below)
    • Laplace approximation: the old-task posterior is treated as a Gaussian distribution
    • Alpha: task-importance weight (written λ in the paper)
    • L_B: loss for task B
    • F: Fisher information matrix
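
These bullets correspond to two formulas from the paper, written here in the paper's notation (λ is the task-importance weight the slide calls alpha, θ*_A are the parameters learned on task A):

```latex
% Bayesian split of the posterior over parameters after tasks A and B:
\log p(\theta \mid \mathcal{D}) =
  \log p(\mathcal{D}_B \mid \theta) + \log p(\theta \mid \mathcal{D}_A) - \log p(\mathcal{D}_B)

% Laplace approximation: p(theta | D_A) is treated as a Gaussian centred at the
% task-A optimum theta*_A, with precision given by the diagonal Fisher information F.
% This yields the EWC objective (lambda = task importance):
\mathcal{L}(\theta) = \mathcal{L}_B(\theta)
  + \sum_i \frac{\lambda}{2}\, F_i \left(\theta_i - \theta^{*}_{A,i}\right)^2
```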

5 of 12

Data

  • MNIST
  • Atari games

6 of 12

MNIST

  • Fixed-size FFNN
  • 10 tasks (random pixel permutations of MNIST)
  • Hyperparameters: may be tweaked for the two-task case
  • Methods compared:
    • SGD
    • SGD + dropout
    • SGD + L2 penalty
    • EWC (see the sketch after this list)
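
A minimal sketch of EWC on a fixed-size feed-forward net, assuming PyTorch; random tensors stand in for the permuted-MNIST tasks, and the diagonal Fisher is the empirical variant estimated from squared log-likelihood gradients at the task-A optimum:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

def make_task(n=256, dim=784, classes=10):
    """Stand-in for one permuted-MNIST task: random inputs and labels."""
    return torch.randn(n, dim), torch.randint(0, classes, (n,))

net = nn.Sequential(nn.Linear(784, 400), nn.ReLU(), nn.Linear(400, 10))

def diagonal_fisher(net, x, y):
    """Diagonal Fisher information: mean squared gradient of the
    log-likelihood over the old task's data."""
    fisher = {name: torch.zeros_like(p) for name, p in net.named_parameters()}
    for xi, yi in zip(x, y):
        net.zero_grad()
        log_prob = F.log_softmax(net(xi.unsqueeze(0)), dim=1)[0, yi]
        log_prob.backward()
        for name, p in net.named_parameters():
            fisher[name] += p.grad.detach() ** 2
    return {name: f / len(x) for name, f in fisher.items()}

def ewc_penalty(net, fisher, old_params, lam=400.0):
    """Quadratic penalty anchoring weights to the task-A optimum,
    weighted by the diagonal Fisher (lam = task importance)."""
    loss = 0.0
    for name, p in net.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return lam / 2.0 * loss

def train(net, x, y, penalty_fn=None, epochs=5, lr=1e-3):
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.cross_entropy(net(x), y)
        if penalty_fn is not None:
            loss = loss + penalty_fn(net)
        loss.backward()
        opt.step()

# Task A: plain training, then freeze a copy of the weights and the Fisher.
xa, ya = make_task()
train(net, xa, ya)
old_params = {name: p.detach().clone() for name, p in net.named_parameters()}
fisher = diagonal_fisher(net, xa, ya)

# Task B: the loss is L_B plus the EWC penalty, so weights that matter for A move little.
xb, yb = make_task()
train(net, xb, yb, penalty_fn=lambda m: ewc_penalty(m, fisher, old_params))
```

The penalty anchors weights with high Fisher information to their task-A values while leaving the remaining weights free to learn task B.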

7 of 12

Results (MNIST)

8 of 12

Atari games (DQN)

  • 10 games
  • Previous approaches:
    • Train a separate network per game, then use them to train a single network
    • Add capacity to the network
  • EWC:
    • Task context inferred as the latent variable of a Hidden Markov Model
    • Short-term memory buffers for each inferred task
    • Small number of game-specific parameters (see the sketch below)
    • Reinforcement learning (DQN)
    • CNN
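
A minimal sketch of the "small number of game-specific parameters" idea, roughly following the paper's per-game biases and multiplicative gains; a linear trunk stands in for the DQN's CNN, and all names and sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SharedTrunkWithGameParams(nn.Module):
    def __init__(self, n_games, n_actions, hidden=256):
        super().__init__()
        # Shared trunk (the real model is a convolutional DQN over game frames).
        self.shared = nn.Sequential(nn.Linear(128, hidden), nn.ReLU())
        # One gain and bias vector per game: the only game-specific parameters.
        self.gains = nn.Parameter(torch.ones(n_games, hidden))
        self.biases = nn.Parameter(torch.zeros(n_games, hidden))
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, x, game_id):
        h = self.shared(x)
        # The inferred task context selects which gains/biases are active.
        h = h * self.gains[game_id] + self.biases[game_id]
        return self.head(h)

# Usage: Q-values for a batch of 4 states under game 3.
net = SharedTrunkWithGameParams(n_games=10, n_actions=18)
q_values = net(torch.randn(4, 128), game_id=3)
```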

9 of 12

Results (Atari)

10 of 12

Results (Atari)

11 of 12

Feature extraction?

Shared and separate features for tasks

12 of 12

References

  • Kirkpatrick, J. et al.: Overcoming catastrophic forgetting in neural networks. PNAS, 2017. arXiv:1612.00796