Minor sparse coding update
The updated work presented here was carried out primarily in
January and February 2023.
Lee Sharkey, Dan Braun, Beren Millidge
Toy data experimental setup (recap)
Data
Reconstruction
Dictionary
Toy data experimental setup (recap)
Yellow = Ground truth features recovered
Pink/purple = Ground truth features not recovered
Mean Max Cosine Similarity (MMCS) between learned dictionary and ground truth features:
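As a rough illustration of the metric named above, here is a minimal sketch of how MMCS might be computed. It assumes that both the learned dictionary and the ground truth features are stored as rows of a matrix, and that the maximum is taken over learned elements for each ground truth feature. The function name `mmcs` and the toy arrays are hypothetical and are not taken from the original experiments.

```python
import numpy as np

def mmcs(learned, ground_truth):
    """Mean max cosine similarity (sketch); rows are feature vectors."""
    # Normalise every row to unit length so dot products are cosine similarities.
    learned_unit = learned / np.linalg.norm(learned, axis=1, keepdims=True)
    truth_unit = ground_truth / np.linalg.norm(ground_truth, axis=1, keepdims=True)
    # cos[i, j] = cosine similarity of ground truth feature i and learned element j.
    cos = truth_unit @ learned_unit.T
    # Best match per ground truth feature, averaged over all ground truth features.
    return float(cos.max(axis=1).mean())

# Hypothetical toy example: three ground truth features, four learned elements.
truth = np.eye(3)
learned = np.array([[2.0, 0.0, 0.0],
                    [0.0, 0.0, 5.0],
                    [0.0, 1.0, 0.0],
                    [1.0, 1.0, 0.0]])
score = mmcs(learned, truth)
```

Under this convention a score near 1 means every ground truth feature has a closely aligned dictionary element, regardless of the element's scale or position in the dictionary.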
Previously…
Interim report showed substantial differences between toy data and language model data
1: Number of dead neurons
2: Loss stickiness
3: Similarity of each learned dictionary to dictionaries larger than it
Toy data
Language model data
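The dead-neuron count in comparison 1 could be obtained along the following lines. This is only a sketch under a stated assumption: a dictionary element is treated as "dead" if its activation never exceeds a threshold on any sample. The slides do not give the exact criterion, and `count_dead`, `threshold`, and the toy activations are hypothetical.

```python
import numpy as np

def count_dead(activations, threshold=0.0):
    """Count dictionary elements that never activate (assumed definition of 'dead')."""
    # activations has shape (n_samples, n_dictionary_elements).
    ever_active = (np.abs(activations) > threshold).any(axis=0)
    return int((~ever_active).sum())

# Hypothetical activations: only the middle element ever fires.
acts = np.array([[0.0, 0.3, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 1.2, 0.0]])
n_dead = count_dead(acts)
```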
A few new results
A few changes compared with interim report
A small update (Toy data)
Yellow = Ground truth features recovered
Pink/purple = Ground truth features not recovered
Mean Max Cosine Similarity (MMCS) between learned dictionary and ground truth features:
New results
1: Number of dead neurons
2: Loss stickiness (deprecated)
3: Similarity of each learned dictionary to dictionaries larger than it
Toy data
Language model data
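One way to picture comparison 3 is to ask, for each element of a smaller dictionary, how closely it is matched by any element of a larger one. The sketch below assumes dictionaries are stored as rows and uses maximum cosine similarity as the matching score. The function name `max_cosine_to_larger` and the random data are hypothetical, and the exact procedure used for the plots may differ.

```python
import numpy as np

def max_cosine_to_larger(small_dict, large_dict):
    """For each row of small_dict, the best cosine similarity to any row of large_dict."""
    small_unit = small_dict / np.linalg.norm(small_dict, axis=1, keepdims=True)
    large_unit = large_dict / np.linalg.norm(large_dict, axis=1, keepdims=True)
    return (small_unit @ large_unit.T).max(axis=1)

# Hypothetical data: the larger dictionary contains rescaled copies of the
# first three elements of the smaller one, plus five unrelated elements.
rng = np.random.default_rng(0)
small = rng.normal(size=(4, 16))
large = np.vstack([small[:3] * 2.0, rng.normal(size=(5, 16))])
per_element = max_cosine_to_larger(small, large)
```

Elements that reappear in the larger dictionary score close to 1, while elements with no counterpart score lower; averaging `per_element` gives a single similarity figure for the pair of dictionaries.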
What are the features?
Dataset examples that maximally activate particular dictionary elements
Example: Feature 324
What are the features? (Random sample)
(See interface for more dataset examples)
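Finding the dataset examples that maximally activate a dictionary element can be sketched as a simple ranking over activations. The code assumes a precomputed matrix of activations with one row per dataset example and one column per dictionary element. `top_activating`, `feature_idx`, and the toy values are hypothetical, and the index used here is illustrative rather than Feature 324.

```python
import numpy as np

def top_activating(activations, feature_idx, k=3):
    """Return (example_index, activation) pairs for the k strongest activations."""
    column = activations[:, feature_idx]
    # Sort ascending, reverse for descending order, keep the first k indices.
    order = np.argsort(column)[::-1][:k]
    return [(int(i), float(column[i])) for i in order]

# Hypothetical activations: four dataset examples, two dictionary elements.
acts = np.array([[0.1, 0.0],
                 [0.9, 0.2],
                 [0.4, 0.0],
                 [0.7, 0.5]])
top = top_activating(acts, feature_idx=0, k=2)
```

The returned indices would then be mapped back to the corresponding text snippets for inspection.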
Next steps