Explainability for Large Language Models
- Focus on explaining internal mechanisms of LLMs
- Why do they behave the way they do?
Emergent Abilities of Large Language Models
- DeepMind
- Stanford
Emergent
- An ability is emergent if it is not present in smaller models but is present in larger models
- Emergent abilities cannot be predicted by extrapolating a scaling law fitted to smaller models
Larger models:-
Results
Are Emergent Abilities of Large Language Models a Mirage?
- Stanford
Hypothesis
- LLMs have no truly emergent abilities; the apparent jumps come from the choice of evaluation metric
Metrics used:-
"Exact String Match" accuracy
True string: The sun set behind the mountains.
Candidate #1: A lazy dog (score 0)
Candidate #2: A sun set (score 0)
Candidate #3: The sun set in the hills (score 0)
Candidate #4: The sun set behind the mountains. (score 1)
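The all-or-nothing scoring above can be reproduced with a minimal sketch. The function name `exact_match` is a hypothetical helper chosen for illustration, not something named in the source; it assumes a plain, case-sensitive string comparison with no normalization.

```python
def exact_match(candidate: str, reference: str) -> int:
    """Return 1 if the candidate equals the reference exactly, else 0.

    Assumption: case-sensitive comparison with no whitespace or
    punctuation normalization.
    """
    return int(candidate == reference)


reference = "The sun set behind the mountains."
candidates = [
    "A lazy dog",
    "A sun set",
    "The sun set in the hills",
    "The sun set behind the mountains.",
]

# Score every candidate against the true string.
scores = [exact_match(c, reference) for c in candidates]
print(scores)
```

Candidate #3 gets half of the words right yet scores the same as Candidate #1, which shares no words with the true string. The metric gives no partial credit.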
String edit distance
Minimum number of edit operations needed to transform string s1 into string s2.
Operations allowed:-
New accuracy
True string: The sun set behind the mountains.
Candidate #1: A lazy dog (distance 6)
Candidate #2: A sun set (distance 4)
Candidate #3: The sun set in the hills (distance 2)
Candidate #4: The sun set behind the mountains. (distance 0)
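The distances listed above are consistent with an edit distance counted over whitespace-separated words, where inserting, deleting, or substituting one word each costs 1. That reading is an assumption, since the allowed operations are not listed here. The sketch below uses the standard Levenshtein dynamic program under that assumption; `word_edit_distance` is a hypothetical name.

```python
def word_edit_distance(candidate: str, reference: str) -> int:
    """Word-level Levenshtein distance between two strings.

    Assumptions: tokens are whitespace-separated words, and insert,
    delete, and substitute each cost 1.
    """
    a = candidate.split()
    b = reference.split()

    # prev[j] holds the distance between the first i-1 words of a
    # and the first j words of b.
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]  # turning i words into zero words takes i deletions
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,         # delete word_a
                curr[j - 1] + 1,     # insert word_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


reference = "The sun set behind the mountains."
candidates = [
    "A lazy dog",
    "A sun set",
    "The sun set in the hills",
    "The sun set behind the mountains.",
]

distances = [word_edit_distance(c, reference) for c in candidates]
print(distances)
```

Unlike exact match, this metric changes gradually as a candidate gets closer to the true string, so partial progress becomes visible.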
Results
Current literature
Few papers try to explain the internal mechanisms of LLMs or Transformers applied to graphs.
Focus on explaining:-