Meta-Learned Conversational Neural Memory in a Transformer-Based Chatbot
Mustafa Abdool (Moose), Carson Lam
Possible obstacles to Human-Level AI
Expectation:
Me: uhh . . . Hi
Bot: Hey, what’s your name?
Me: Carson
Bot: That’s a weird name
Me: What is my name?
Bot: it’s Carson, you just told me
Reality:
Me: My dog’s name is Fanny. Fanny is my dog.
Bot: That is a beautiful name for a dog
Me: What is my dog’s name?
Bot: I would not know that
Me: Fanny is my dog’s name
Bot: How old is your dog?
Me: What is my dog’s name?
Bot: It’s a secret
Short Term and Linearly Accumulating Memory Solutions
Complex Decision Trees connected to a Traditional Database
Other ways to solve the coreference resolution problem and to build conversational memory in real time
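For contrast with the neural approach below, here is a minimal sketch (not our system) of the "decision trees plus traditional database" route: hand-written patterns extract a fact, an ordinary key-value store holds it, and canned rules answer later queries. Every name and regex here is an illustrative assumption.

# Minimal sketch of rule-based conversational memory backed by a plain key-value store.
# MEMORY, handle_turn, and the regexes are hypothetical, for illustration only.
import re

MEMORY = {}  # stands in for a traditional database table of user facts

def handle_turn(utterance: str) -> str:
    utterance = utterance.lower().strip()
    m = re.match(r"my (.+?)'s name is (\w+)", utterance)
    if m:                                    # write path: store a slot
        MEMORY[f"{m.group(1)} name"] = m.group(2)
        return f"That is a nice name for a {m.group(1)}"
    m = re.match(r"what is my (.+?)'s name", utterance)
    if m:                                    # read path: look the slot up
        return MEMORY.get(f"{m.group(1)} name", "I would not know that")
    return "Tell me more"

print(handle_turn("My dog's name is Fanny"))   # stores {"dog name": "fanny"}
print(handle_turn("What is my dog's name?"))   # -> "fanny"

This works until the next pattern the rule writers did not anticipate, which is the brittleness the rest of these slides are trying to escape.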
External, Compressed, Long Term Memory
What This Is
Some AI tasks will require an agent to store neural memory with long-term persistence, decoupled from working memory, protected from unrelated current computation, but accessible at the time of relevance: so-called "external memory", the kind that would hypothetically be appropriate for lifelong learning.
Differentiable Neural Computer (Graves et al., 2016)
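As a rough illustration of what NTM/DNC-style external memory means in practice, the sketch below does a content-based read: a query key is compared against every memory slot by cosine similarity and the read is a softmax-weighted sum over slots. The sizes and variable names are assumptions for illustration, not the DNC implementation.

# Content-based addressing over an external memory matrix, in the spirit of
# NTM/DNC reads (Graves et al., 2014; 2016). Sizes and names are illustrative.
import torch
import torch.nn.functional as F

N, W = 128, 64                     # number of memory slots, word size
memory = torch.randn(N, W)         # external memory, persists across timesteps
key = torch.randn(W)               # read key emitted by the agent
beta = 5.0                         # key strength (sharpens the address)

# cosine similarity between the key and every memory slot
sim = F.cosine_similarity(memory, key.unsqueeze(0), dim=-1)   # (N,)
weights = F.softmax(beta * sim, dim=-1)                        # soft address over slots
read_vector = weights @ memory                                 # (W,) weighted read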
Meta-Learned Neural Memory
Meta-Learned Memory
[Diagram: the agent is asked "What’s crazy dude’s name again?"; the memory maps the key "crazy dude’s name" to the stored value "carson", and the agent answers "It’s Carson".]
MNM Training
[Diagram: during MNM training, the same exchange ("What’s that crazy dude’s name again?" / "It’s Carson") supplies the key "crazy dude’s name" and value "carson" that the memory learns to write.]
s_t: the agent's representation of the state
m_t: the memory's response to the state
h_t = controller(s_t, h_{t-1})
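A minimal sketch of the read path these equations suggest, assuming a GRU controller and a linear fast-weight memory; the layer sizes, the to_key projection, and how h_t and m_t are later combined are illustrative assumptions rather than the exact MNM architecture.

# Read path sketch: controller consumes s_t, emits a key, memory answers with m_t.
import torch
import torch.nn as nn

state_dim, hidden_dim, mem_dim = 64, 128, 64

controller = nn.GRUCell(state_dim, hidden_dim)   # h_t = controller(s_t, h_{t-1})
to_key = nn.Linear(hidden_dim, mem_dim)          # controller emits a read key
M = torch.zeros(mem_dim, mem_dim)                # fast-weight memory, written at test time

s_t = torch.randn(1, state_dim)                  # agent's representation of the state
h_prev = torch.zeros(1, hidden_dim)

h_t = controller(s_t, h_prev)                    # update controller state
k_t = to_key(h_t)                                # (1, mem_dim) read key
m_t = k_t @ M.T                                  # memory's response to the state
# h_t and m_t are then combined (e.g. concatenated) to produce the agent's output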
Perceptron Learning Rule for Gradient-Free Fast Write to Memory @ test time
The meta-learning task during training is to learn the best B and z' for immediately storing, or not storing, new memories.
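A sketch of what a perceptron-style, gradient-free write can look like for a linear memory: nudge the memory matrix by a rank-one update so the key now maps to the desired value. In MNM the write rate (the slide's B) and the write target (the slide's z') come from the meta-learned controller; here the rate is chosen analytically so a single write stores the pair exactly, which is a simplification for illustration.

# Gradient-free fast write: a rank-one, perceptron-style update to a linear memory.
import torch

mem_dim = 64
M = torch.zeros(mem_dim, mem_dim)          # linear fast-weight memory

def fast_write(M, k, v_target, beta):
    # Move M so that M @ k moves toward v_target; no backprop is needed.
    # In MNM, beta (the slide's B) and the write target (the slide's z')
    # are meta-learned; here they are fixed placeholders for illustration.
    v_now = M @ k                          # what the memory currently returns for this key
    error = v_target - v_now               # prediction error for this key
    return M + beta * torch.outer(error, k)

k = torch.randn(mem_dim)                   # key, e.g. an encoding of "crazy dude's name"
v = torch.randn(mem_dim)                   # value, e.g. an encoding of "carson"
beta = 1.0 / (k @ k)                       # with this rate one write stores the pair exactly
M = fast_write(M, k, v, beta)
print(torch.allclose(M @ k, v, atol=1e-5)) # True: reading with the key returns the value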
In-training output on a simple short-term coreference resolution problem:
> my name is adam
> hi adam
> what is my name?
> its adam
> my name is vicki
> hi vicki
> what is my name?
> its vicki
> my name is chloe
> hi chloe
> what is my name?
> its chloe
> my name is moose
> hi moose
> what is my name?
> its moose
Next Steps
Say ‘hi carson’ at:
References
Alex Graves, Greg Wayne, and Ivo Danihelka. Neural Turing Machines. arXiv preprint arXiv:1410.5401, 2014.
Alex Graves, Greg Wayne, Malcolm Reynolds, Tim Harley, Ivo Danihelka, Agnieszka Grabska-Barwińska, Sergio Gómez Colmenarejo, Edward Grefenstette, Tiago Ramalho, John Agapiou, et al. Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626):471, 2016.
Yan Wu, Gregory Wayne, Karol Gregor, and Timothy Lillicrap. Learning attractor dynamics for generative memory. In Advances in Neural Information Processing Systems, pages 9401–9410, 2018.
Yan Wu, Greg Wayne, Alex Graves, and Timothy Lillicrap. The Kanerva Machine: A generative distributed memory. ICLR, 2018.
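Tsendsuren Munkhdalai, Alessandro Sordoni, Tong Wang, and Adam Trischler. Metalearned neural memory. In Advances in Neural Information Processing Systems, 2019.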