Can computers REALLY understand human languages?
Who am I?
Overview
What is Natural Language Processing?
NLP is where computer scientists try to make computers talk and where linguists try to talk to computers,
ultimately making the computer produce and understand human language.
What is Natural Language Processing?
The ultimate, ideal goal is:
to create talking
androids or Cylons
How do humans communicate using language?
Hear
Think
Speak
Write
Read
How do computers communicate using human language?
Hear
Think
Speak
Write
Read
Speech Recognition
Grammar, Semantics,
Knowledge processing, ...
Text-to-Speech Synthesizers
Various Applications
Corpora, Dictionaries,
Ontologies, ...
Siri - Talking to iPhone
Science behind Siri
Language Understanding
Feature Extraction
Word Recognition
Grammatical Structure
Literal meaning of words
Inference, implications, humor, etc
Speech Recognition
(ASR)
Natural Language Generation (NLG)
Speech Synthesis
Generating and Ranking Answers
How to reply appropriately
Science behind Siri
Language Understanding (ASU)
Feature Extraction
Word Recognition
Syntactic Analysis
Semantic Analysis
Pragmatic Analysis
Speech Recognition
(ASR)
Natural Language Generation (NLG)
Speech Synthesis
Answer Generation
Dialog Systems
Automatic Speech Recognition
ASR is straightforward: record some sound, analyze it, and then guess what was said:
Feature Extraction
Spotting patterns (i.e. features) in a sound wave.
Feature Extraction: IPA
Before pattern spotting, we ask ourselves what types of sounds are there?
Strong bursts of energy from vocalic sounds
Airy, fuzzy sound from spirants (fricatives)
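A minimal sketch of how those two cues could be measured, assuming two textbook features that the slides do not name: short-time energy (loudness of a frame) and zero-crossing rate (how often the wave changes sign, which is high for noisy sounds). The signals are synthetic stand-ins invented for illustration, not real speech, and the sampling rate and frame size are arbitrary choices.

```python
import math
import random

def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)

SAMPLE_RATE = 8000  # assumed sampling rate in Hz
FRAME_SIZE = 400    # 50 ms of audio at that rate

# Vowel-like frame: a loud, low-frequency periodic wave (150 Hz sine).
vowel = [0.8 * math.sin(2 * math.pi * 150 * t / SAMPLE_RATE)
         for t in range(FRAME_SIZE)]

# Fricative-like frame: quiet random noise (seeded so the run is repeatable).
rng = random.Random(0)
fricative = [0.1 * rng.uniform(-1.0, 1.0) for _ in range(FRAME_SIZE)]

for name, frame in [("vowel-like", vowel), ("fricative-like", fricative)]:
    print(name, short_time_energy(frame), zero_crossing_rate(frame))
```

The vowel-like frame comes out with much higher energy and a much lower zero-crossing rate than the noise frame, which is the kind of contrast a recognizer can pick up on.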
Word matching
Ɵ I ʌ ɹ aI v ʌ l o̞ f bɛ n w a l: ɘ s:
Google STT Demo by alvas
If you have the Google Chrome browser and a Linux distro, open a terminal, paste this, and press Return:
echo '<input type="text" x-webkit-speech />' > chrome-speak.html && google-chrome chrome-speak.html
Revived History:
Articulatory synthesis
They seriously built "robots" that tried to imitate the human nose, mouth, lips, vocal tract and tongue (e.g. Kempelen's speaking machine, the Voder, Vocandroids)
Some-time-ago History:
Concatenative Synthesis
They asked people to record large chunks of speech, then cut and pasted the pieces together.
State of the art:
Diphone Synthesis (Unit Selection)
Now the Stephen Hawking way...
What is a phone? A single speech sound
What is a diphone? Two adjacent sounds, cut from the middle of one to the middle of the next
Example:
green day
-g , gr , ri , i: , in , n- , -d , de , ei , i-
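As a rough sketch of how a word is cut into diphones, the snippet below pairs each phone with its neighbour and pads both ends with "-" for silence. The function name `diphones` and the phone spellings are assumptions made for illustration, and the long vowel is treated as the single phone "i", so the output has one item fewer than the list on the slide.

```python
def diphones(phones, boundary="-"):
    """Return the adjacent phone pairs of a word, padded with silence."""
    padded = [boundary] + list(phones) + [boundary]
    return [a + b for a, b in zip(padded, padded[1:])]

green = diphones(["g", "r", "i", "n"])
day = diphones(["d", "e", "i"])

print(green)  # ['-g', 'gr', 'ri', 'in', 'n-']
print(day)    # ['-d', 'de', 'ei', 'i-']
```

A synthesizer would then look up a recording for each pair and join the recordings in order.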
State of the art:
Diphone Synthesis (Unit Selection)
Q: English has 26 letters in its alphabet, so do we get 26^2 diphones?
A: No. There are 26 letters but about 40 phonemes, so about 1600 (40^2) possible diphones.
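The arithmetic can be checked in a couple of lines. The figure of 40 phonemes is the approximate count from the slide, and the square is an upper bound, since not every pair of sounds actually occurs in English.

```python
letters = 26
phonemes = 40  # approximate count taken from the slide

pairs_from_letters = letters ** 2    # the tempting but wrong estimate
pairs_from_phonemes = phonemes ** 2  # upper bound on distinct diphones

print(pairs_from_letters)   # 676
print(pairs_from_phonemes)  # 1600
```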
(click here for Google synthesizer hack by alvas)
Siri's competitor
What is Jeopardy?
You are given the answer (a hint) and you need to give the question that asks about the
thing / time / person / location / etc...
For example,
A: The person who walked on the moon and is crowned the king of pop.
Q: Who is Michael Jackson?
Computers DON'T:
(language structure and meaning)
(question forming)
Computers DON'T:
(world knowledge)
(subtlety in language, pragmatics)
Watson: Grammar and Semantics
Q: How do computers understand our grammar? How do computers know:
[ [The person] [who walked on moon] ] and [is crowned [the king of pop] ].
A: Using Deep or Shallow processing methods
Deep Linguistic Processing
You have a grammarian or computational linguist sitting down in his office telling the computer,
When you see: (Part of Speech Tagging)
Deep Linguistic Processing
You have a grammarian or computational linguist sitting down in his office telling the computer,
When you see: (Combinatory Rules/Constraints)
Shallow Language Processing
Extract a sample from your data, hire some undergrads and tell them to assign a tag to each word:
Then take what the students tagged, feed it to some machine learning software, and it tags the rest automatically (MAGIC!!! Yes, it is!!!)
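To take some of the magic out of it, here is a minimal sketch of the simplest possible learner: it remembers the most frequent tag for each word in the hand-tagged sample and falls back to the overall most common tag for words it has never seen. The tiny sample, the tag names and the functions `train` and `tag` are all made up for illustration; real taggers use far more data and context-aware models.

```python
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Learn each word's most frequent tag, plus a fallback tag."""
    counts = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        for word, tag_name in sentence:
            counts[word.lower()][tag_name] += 1
            all_tags[tag_name] += 1
    model = {w: c.most_common(1)[0][0] for w, c in counts.items()}
    default_tag = all_tags.most_common(1)[0][0]
    return model, default_tag

def tag(words, model, default_tag):
    """Tag new words with the learned tag, or the fallback if unseen."""
    return [(w, model.get(w.lower(), default_tag)) for w in words]

# Hypothetical hand-tagged sample (what the undergrads would produce).
sample = [
    [("the", "DET"), ("person", "NOUN"), ("walked", "VERB")],
    [("the", "DET"), ("king", "NOUN"), ("is", "VERB"), ("crowned", "VERB")],
    [("a", "DET"), ("moon", "NOUN"), ("landing", "NOUN")],
]

model, default_tag = train(sample)
tagged = tag(["The", "king", "walked", "home"], model, default_tag)
print(tagged)
```

The unseen word "home" gets the fallback tag, which here happens to be NOUN because nouns are the most common tag in the sample.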
Shallow Language Processing
After tagging, we write some simple rules to tell the computer which words can combine with which:
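One such rule can be sketched as follows: an optional determiner, any number of adjectives, and one or more nouns combine into a noun phrase (NP). The rule, the tag names and the function `chunk_noun_phrases` are illustrative assumptions, not the rules shown on the original slide.

```python
def chunk_noun_phrases(tagged):
    """Group DET? ADJ* NOUN+ into NP chunks; leave other words alone."""
    chunks = []
    i = 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "DET":
            j += 1
        while j < len(tagged) and tagged[j][1] == "ADJ":
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1] == "NOUN":
            k += 1
        if k > j:
            # At least one noun was found, so the rule applies.
            chunks.append(("NP", [word for word, _ in tagged[i:k]]))
            i = k
        else:
            # Rule does not apply here; emit the word with its own tag.
            chunks.append((tagged[i][1], [tagged[i][0]]))
            i += 1
    return chunks

sentence = [("the", "DET"), ("person", "NOUN"), ("walked", "VERB"),
            ("on", "PREP"), ("the", "DET"), ("moon", "NOUN")]
chunks = chunk_noun_phrases(sentence)
print(chunks)
```

For this sentence the rule groups "the person" and "the moon" into NPs and leaves "walked" and "on" as single words.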
Beyond Grammar
Now the computer can understand the structure of human language. How about:
- meaning ???
- world knowledge ???
- humor ???
(There are tonnes of science behind these, but for today I'll just tell you what is required to solve them.)
Resources needed to go beyond grammar
Science behind Siri
Language Understanding (ASU)
Feature Extraction
Word Recognition
Syntactic Analysis
Semantic Analysis
Pragmatic Analysis
Speech Recognition
(ASR)
Natural Language Generation (NLG)
Speech Synthesis
Answer Generation
Dialog Systems
Can computers REALLY understand human languages?
No,
at least not yet
Appendix
ASR Demos to try (in your free time)
Speech Synthesizers Demo + vids
Computational Meaning
Lexical: each word has one or more meanings called senses, and the combination of the chosen senses gives the meaning of the sentence
Logical: each word denotes a certain mathematical operation, and the combination of these operations yields a statement that is true or false
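The logical view can be illustrated with a toy example, assuming an invented miniature "world" of three individuals. Nouns denote sets, and words like "every", "some" and "no" denote operations on those sets that evaluate to true or false. The names and facts below are made up for illustration only.

```python
# A toy world: which individuals belong to which category (invented facts).
people = {"michael", "elvis", "neil"}
singers = {"michael", "elvis"}
astronauts = {"neil"}

# Each quantifier word is an operation on two sets.
def every(restrictor, scope):
    """'Every A is a B' holds when A is a subset of B."""
    return restrictor <= scope

def some(restrictor, scope):
    """'Some A is a B' holds when A and B share a member."""
    return bool(restrictor & scope)

def no(restrictor, scope):
    """'No A is a B' holds when A and B share no member."""
    return not (restrictor & scope)

print(every(singers, people))     # "Every singer is a person"
print(some(astronauts, singers))  # "Some astronaut is a singer"
print(no(astronauts, singers))    # "No astronaut is a singer"
```

In this toy world the first and third sentences come out true and the second comes out false.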