Word Meaning and Similarity
Word Senses and Word Relations
Dan Jurafsky
Reminder: lemma and wordform
Wordform | Lemma |
banks | bank |
sung | sing |
duermes | dormir |
Lemmas have senses
Sense (or word sense): a discrete representation of an aspect of a word’s meaning.
Homonymy
Homonyms: words that share a form but have unrelated, distinct meanings:
Homonymy causes problems for NLP applications
Polysemy
Metonymy or Systematic Polysemy: a systematic relationship between senses
Author (Jane Austen wrote Emma) ↔ Works of Author (I love Jane Austen)
Tree (Plums have beautiful blossoms) ↔ Fruit (I ate a preserved plum)
How do we know when a word has more than one sense?
Synonyms
Synonymy is a relation between senses rather than words
Antonyms
dark/light short/long fast/slow rise/fall
hot/cold up/down in/out
Antonyms can define a binary opposition or be at opposite ends of a scale
Hyponymy and Hypernymy
Superordinate/hypernym | vehicle | fruit | furniture |
Subordinate/hyponym | car | mango | chair |
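The IS-A relation in the table can be followed upward through several levels. The following is a minimal sketch of that idea, not WordNet's own data structure: the three links come from the table, while the extra `artifact` level and the function names are assumptions added for illustration.

```python
# Toy IS-A links (hyponym -> hypernym). The first three come from the table;
# the "artifact" level is an assumption added to illustrate transitivity.
HYPERNYM_OF = {
    "car": "vehicle",
    "mango": "fruit",
    "chair": "furniture",
    "vehicle": "artifact",    # assumption, not in the table
    "furniture": "artifact",  # assumption, not in the table
}


def hypernym_chain(word):
    """Return all ancestors of `word`, nearest first."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain


def is_hyponym(sub, sup):
    """True if `sub` IS-A `sup`, directly or through intermediate nodes."""
    return sup in hypernym_chain(sub)


print(hypernym_chain("car"))           # ['vehicle', 'artifact']
print(is_hyponym("car", "artifact"))   # True
print(is_hyponym("mango", "vehicle"))  # False
```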
Hyponymy more formally
Hyponyms and Instances
Word Meaning and Similarity
WordNet and other Online Thesauri
Applications of Thesauri and Ontologies
WordNet 3.0
Category | Unique Strings |
Noun | 117,798 |
Verb | 11,529 |
Adjective | 22,479 |
Adverb | 4,481 |
Senses of “bass” in WordNet
How is “sense” defined in WordNet?
Gloss: “a person who is gullible and easy to take advantage of”
Synset (each lemma with its sense number): chump_1, fool_2, gull_1, mark_9, patsy_1, fall guy_1, sucker_1, soft touch_1, mug_2
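One way to picture a sense defined this way is as a record holding a gloss and a set of (lemma, sense number) pairs. The sketch below is only an illustration: the `Synset` class and the helper names are hypothetical and are not WordNet's actual storage format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """Hypothetical container: a gloss plus the senses that share it."""
    gloss: str
    senses: tuple  # (lemma, sense_number) pairs


GULLIBLE_PERSON = Synset(
    gloss="a person who is gullible and easy to take advantage of",
    senses=(
        ("chump", 1), ("fool", 2), ("gull", 1), ("mark", 9), ("patsy", 1),
        ("fall guy", 1), ("sucker", 1), ("soft touch", 1), ("mug", 2),
    ),
)


def lemmas(synset):
    """Lemmas that have a sense in this synset."""
    return [lemma for lemma, _ in synset.senses]


def share_sense(sense_a, sense_b, synset):
    """True if both (lemma, sense_number) pairs belong to the synset."""
    return sense_a in synset.senses and sense_b in synset.senses


print(len(lemmas(GULLIBLE_PERSON)))                          # 9
print(share_sense(("chump", 1), ("mug", 2), GULLIBLE_PERSON))  # True
print(share_sense(("fool", 1), ("mug", 2), GULLIBLE_PERSON))   # False
```

Membership is checked on the sense, not the bare word: fool_2 is in the synset, but a different sense of "fool" is not.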
WordNet Hypernym Hierarchy for “bass”
WordNet Noun Relations
WordNet 3.0
MeSH: Medical Subject Headings, thesaurus from the National Library of Medicine
Entry Terms (the synset): Eryhem, Ferrous Hemoglobin, Hemoglobin
Definition: The oxygen-carrying proteins of ERYTHROCYTES. They are found in all vertebrates and some invertebrates. The number of globin subunits in the hemoglobin quaternary structure differs between species. Structures range from monomeric to a variety of multimeric arrangements.
The MeSH Hierarchy
Uses of the MeSH Ontology
Word Meaning and Similarity
Word Similarity: Thesaurus Methods
Word Similarity
Why word similarity
Word similarity and word relatedness
Two classes of similarity algorithms
Path based similarity
Refinements to path-based similarity
wordsim(w1,w2) = max sim(c1,c2) over c1∈senses(w1), c2∈senses(w2)
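When sense tags are not available, a word-level score can be taken as the maximum sense-level similarity over all pairs c1∈senses(w1), c2∈senses(w2). The sketch below shows only that maximisation; the sense inventory and the sense-level scores are made-up placeholders, not values from WordNet.

```python
from itertools import product

# Hypothetical sense inventory and sense-level similarity scores.
SENSES = {
    "bank": ["bank#1", "bank#2"],
    "shore": ["shore#1"],
}
SENSE_SIM = {
    ("bank#1", "shore#1"): 0.1,  # made-up score
    ("bank#2", "shore#1"): 0.5,  # made-up score
}


def sense_sim(c1, c2):
    """Symmetric lookup of the (hypothetical) sense-level similarity."""
    return SENSE_SIM.get((c1, c2), SENSE_SIM.get((c2, c1), 0.0))


def wordsim(w1, w2):
    """Maximum sense-level similarity over all pairs of senses."""
    return max(sense_sim(c1, c2)
               for c1, c2 in product(SENSES[w1], SENSES[w2]))


print(wordsim("bank", "shore"))  # 0.5
```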
Example: path-based similarity
simpath(c1,c2) = 1/pathlen(c1,c2)
simpath(nickel,coin) = 1/2 = .5
simpath(fund,budget) = 1/2 = .5
simpath(nickel,currency) = 1/4 = .25
simpath(nickel,money) = 1/6 = .17
simpath(coinage,Richter scale) = 1/6 = .17
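The values above can be reproduced with a breadth-first search over a small hypernym graph. This is a sketch under two assumptions: the child-to-parent links below are a guess at a hierarchy consistent with the listed values (the original figure is not in the text), and pathlen is taken to be 1 + the number of edges on the shortest path, which is what 1/2 for a direct link implies.

```python
from collections import deque

# Assumed hypernym links (child -> parent), chosen to be consistent with
# the simpath values listed above; not copied from WordNet.
PARENT = {
    "nickel": "coin",
    "dime": "coin",
    "coin": "coinage",
    "coinage": "currency",
    "currency": "medium of exchange",
    "money": "medium of exchange",
    "fund": "money",
    "budget": "fund",
    "medium of exchange": "standard",
    "scale": "standard",
    "Richter scale": "scale",
}


def neighbors(node):
    """Children and parent of a node (edges are treated as undirected)."""
    out = [child for child, parent in PARENT.items() if parent == node]
    if node in PARENT:
        out.append(PARENT[node])
    return out


def pathlen(c1, c2):
    """1 + number of edges on the shortest path between c1 and c2."""
    seen = {c1}
    queue = deque([(c1, 0)])
    while queue:
        node, edges = queue.popleft()
        if node == c2:
            return 1 + edges
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, edges + 1))
    raise ValueError(f"{c1!r} and {c2!r} are not connected")


def simpath(c1, c2):
    return 1 / pathlen(c1, c2)


print(round(simpath("nickel", "coin"), 2))            # 0.5
print(round(simpath("nickel", "money"), 2))           # 0.17
print(round(simpath("coinage", "Richter scale"), 2))  # 0.17
```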
Problem with basic path-based similarity
Information content similarity metrics
Resnik 1995. Using information content to evaluate semantic similarity in a taxonomy. IJCAI
Information content similarity
Each instance of a word such as hill also counts toward the frequency of natural elevation, geological formation, entity, etc.
[Figure: fragment of the WordNet noun hierarchy, with nodes entity, …, geological-formation, natural elevation, shore, cave, hill, ridge, coast, grotto]
D. Lin. 1998. An Information-Theoretic Definition of Similarity. ICML 1998
Information content: definitions
IC(c) = -log P(c)
LCS(c1,c2) = the most informative (lowest) node in the hierarchy subsuming both c1 and c2
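A minimal sketch of these two definitions follows. The taxonomy shape and the corpus counts are made up for illustration (they reuse node names from the hierarchy figure but are not taken from WordNet or any corpus), and the natural log is an arbitrary choice of base.

```python
import math

# Hypothetical taxonomy (child -> parent) and made-up corpus counts.
PARENT = {
    "hill": "natural elevation",
    "ridge": "natural elevation",
    "coast": "shore",
    "grotto": "cave",
    "natural elevation": "geological-formation",
    "shore": "geological-formation",
    "cave": "geological-formation",
    "geological-formation": "entity",
}
WORD_COUNTS = {"hill": 4, "ridge": 2, "coast": 6, "grotto": 1, "entity": 87}


def subsumers(c):
    """c itself followed by every node that subsumes it, lowest first."""
    chain = [c]
    while c in PARENT:
        c = PARENT[c]
        chain.append(c)
    return chain


def concept_counts():
    """Each word occurrence counts for the word and all its subsumers."""
    counts = {}
    for word, n in WORD_COUNTS.items():
        for node in subsumers(word):
            counts[node] = counts.get(node, 0) + n
    return counts


COUNTS = concept_counts()
N = sum(WORD_COUNTS.values())  # total number of word tokens


def P(c):
    return COUNTS[c] / N


def IC(c):
    return -math.log(P(c))


def LCS(c1, c2):
    """Lowest node that subsumes both c1 and c2."""
    shared = set(subsumers(c2))
    return next(node for node in subsumers(c1) if node in shared)


print(P("geological-formation"))  # 0.13
print(LCS("hill", "coast"))       # geological-formation
print(LCS("hill", "ridge"))       # natural elevation
```

Because counts propagate upward, P(c) can only grow and IC(c) can only shrink on the way to the root, where P = 1 and IC = 0.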
Using information content for similarity: the Resnik method
Resnik: measure common information as the information content of the most informative (lowest) subsumer (MIS/LCS) of the two nodes
Philip Resnik. 1995. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. IJCAI 1995.
Philip Resnik. 1999. Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language. JAIR 11, 95-130.
Dekang Lin method
Dekang Lin. 1998. An Information-Theoretic Definition of Similarity. ICML
Dekang Lin similarity theorem
Lin similarity function
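The two information-content measures can be compared side by side: Resnik's score is -log P(LCS(c1,c2)), and Lin's is 2 log P(LCS(c1,c2)) / (log P(c1) + log P(c2)). The sketch below assumes the concept probabilities and the LCS of each pair are already known; the numbers are made up for illustration and are not corpus estimates.

```python
import math

# Hypothetical concept probabilities P(c); not estimated from a real corpus.
P = {
    "hill": 0.04,
    "ridge": 0.02,
    "coast": 0.06,
    "natural elevation": 0.06,
    "geological-formation": 0.13,
}
# Hypothetical precomputed lowest common subsumers.
LCS = {
    frozenset(["hill", "coast"]): "geological-formation",
    frozenset(["hill", "ridge"]): "natural elevation",
}


def sim_resnik(c1, c2):
    """Information content of the lowest common subsumer."""
    return -math.log(P[LCS[frozenset([c1, c2])]])


def sim_lin(c1, c2):
    """Shared information divided by the information in both concepts."""
    lcs = LCS[frozenset([c1, c2])]
    return 2 * math.log(P[lcs]) / (math.log(P[c1]) + math.log(P[c2]))


for pair in [("hill", "ridge"), ("hill", "coast")]:
    print(pair, round(sim_resnik(*pair), 2), round(sim_lin(*pair), 2))
```

With these toy numbers both measures rank hill/ridge above hill/coast. Lin's score is normalised by the information content of the two concepts, so here it falls between 0 and 1, while Resnik's is unbounded.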
The (extended) Lesk Algorithm
Summary: thesaurus-based similarity
Libraries for computing thesaurus-based similarity
Evaluating similarity
Levied is closest in meaning to:
imposed, believed, requested, correlated
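A multiple-choice item like this one can be scored by picking the candidate with the highest similarity to the target and counting how often that matches the answer key. The sketch below assumes some similarity function is supplied; `toy_sim` and its scores are hypothetical stand-ins, not the output of any real measure.

```python
def answer_synonym_question(target, candidates, sim):
    """Pick the candidate most similar to the target word."""
    return max(candidates, key=lambda cand: sim(target, cand))


def accuracy(items, sim):
    """Fraction of (target, candidates, gold) items answered correctly."""
    correct = sum(
        answer_synonym_question(target, candidates, sim) == gold
        for target, candidates, gold in items
    )
    return correct / len(items)


# Hypothetical similarity scores for the example question.
TOY_SCORES = {"imposed": 0.61, "believed": 0.12,
              "requested": 0.27, "correlated": 0.08}


def toy_sim(target, candidate):
    return TOY_SCORES[candidate]


ITEMS = [("levied",
          ["imposed", "believed", "requested", "correlated"],
          "imposed")]

print(answer_synonym_question(*ITEMS[0][:2], toy_sim))  # imposed
print(accuracy(ITEMS, toy_sim))                         # 1.0
```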
Word Meaning and Similarity
Word Similarity: Distributional Similarity (I)
Problems with thesaurus-based meaning
Distributional models of meaning
Intuition of distributional word similarity
A bottle of tesgüino is on the table
Everybody likes tesgüino
Tesgüino makes you drunk
We make tesgüino out of corn.
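The intuition is that the words surrounding tesgüino are its evidence. A small sketch of collecting that evidence from the four sentences is below; the ±2-word window, the lowercasing and the punctuation stripping are assumptions made for illustration.

```python
from collections import Counter

SENTENCES = [
    "A bottle of tesgüino is on the table",
    "Everybody likes tesgüino",
    "Tesgüino makes you drunk",
    "We make tesgüino out of corn.",
]


def context_counts(target, sentences, window=2):
    """Count words within `window` positions of each target occurrence."""
    counts = Counter()
    for sentence in sentences:
        tokens = [tok.strip(".,").lower() for tok in sentence.split()]
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            counts.update(tokens[lo:i] + tokens[i + 1:hi])
    return counts


contexts = context_counts("tesgüino", SENTENCES)
print(contexts.most_common(3))
print(contexts["bottle"], contexts["of"], contexts["drunk"])  # 1 2 0
```

With a window of 2, "drunk" falls outside the context in the third sentence; a wider window would pick it up.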
Reminder: Term-document matrix
The words in a term-document matrix
The Term-Context matrix
Sample contexts: 20 words (Brown corpus)
Term-context matrix for word similarity
Should we use raw counts?
Pointwise Mutual Information
Computing PPMI on a term-context matrix
p(w=information,c=data) = 6/19 = .32
p(w=information) = 11/19 = .58
p(c=data) = 7/19 = .37
pmi(information,data) = log2( .32 / (.37*.58) ) = .58 (.57 using full precision)
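The calculation can be checked end to end with a few lines of code. The count matrix below is an assumption: it is one term-context matrix consistent with the fractions 6/19, 11/19 and 7/19, since the table itself is not reproduced in the text. PPMI replaces negative PMI values, and cells with zero counts, by 0.

```python
import math

# Assumed term-context counts, consistent with 6/19, 11/19 and 7/19.
CONTEXTS = ["computer", "data", "pinch", "result", "sugar"]
COUNTS = {
    "apricot":     [0, 0, 1, 0, 1],
    "pineapple":   [0, 0, 1, 0, 1],
    "digital":     [2, 1, 0, 1, 0],
    "information": [1, 6, 0, 4, 0],
}
TOTAL = sum(sum(row) for row in COUNTS.values())  # 19


def p_joint(w, c):
    return COUNTS[w][CONTEXTS.index(c)] / TOTAL


def p_word(w):
    return sum(COUNTS[w]) / TOTAL


def p_context(c):
    j = CONTEXTS.index(c)
    return sum(row[j] for row in COUNTS.values()) / TOTAL


def ppmi(w, c):
    """Positive PMI: max(0, log2 of p(w,c) / (p(w) p(c)))."""
    joint = p_joint(w, c)
    if joint == 0:
        return 0.0  # log of zero is undefined; PPMI treats the cell as 0
    return max(0.0, math.log2(joint / (p_word(w) * p_context(c))))


print(round(ppmi("information", "data"), 2))      # 0.57
print(round(ppmi("information", "computer"), 2))  # 0.0 (PMI is negative)
print(ppmi("apricot", "data"))                    # 0.0 (zero count)
```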
Weighing PMI
Word Meaning and Similarity
Word Similarity: Distributional Similarity (II)
Using syntax to define a word’s context
Modified by adjectives | additional, administrative, assumed, collective, congressional, constitutional … |
Objects of verbs | assert, assign, assume, attend to, avoid, become, breach … |
Co-occurrence vectors based on syntactic dependencies
Dekang Lin, 1998 “Automatic Retrieval and Clustering of Similar Words”
PMI applied to dependency relations
Object of “drink” | Count | PMI |
it | 3 | 1.3 |
anything | 3 | 5.2 |
wine | 2 | 9.3 |
tea | 2 | 11.8 |
liquid | 2 | 10.5 |
Hindle, Don. 1990. Noun Classification from Predicate-Argument Structure. ACL
Object of “drink” | Count | PMI |
tea | 2 | 11.8 |
liquid | 2 | 10.5 |
wine | 2 | 9.3 |
anything | 3 | 5.2 |
it | 3 | 1.3 |
Reminder: cosine for computing similarity
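cos(v,w) = (v · w) / (|v| |w|) = Σi vi wi / ( sqrt(Σi vi^2) · sqrt(Σi wi^2) )
The numerator is the dot product of v and w; dividing by the two vector lengths makes it the dot product of the unit vectors v/|v| and w/|w|.
vi is the PPMI value for word v in context i.
wi is the PPMI value for word w in context i.
cos(v,w) is the cosine similarity of v and w.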
Cosine as a similarity metric
| large | data | computer |
apricot | 1 | 0 | 0 |
digital | 0 | 1 | 2 |
information | 1 | 6 | 1 |
Which pair of words is more similar?
cosine(apricot,information) = 1 / (sqrt(1) · sqrt(38)) = .16
cosine(digital,information) = 8 / (sqrt(5) · sqrt(38)) = .58
cosine(apricot,digital) = 0 / (sqrt(1) · sqrt(5)) = 0
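The three cosines can be computed directly from the rows of the table. This is a plain-Python sketch of the standard cosine (dot product divided by the product of the vector lengths); the function and variable names are illustrative.

```python
import math

# Rows of the table above; columns are large, data, computer.
VECTORS = {
    "apricot":     [1, 0, 0],
    "digital":     [0, 1, 2],
    "information": [1, 6, 1],
}


def cosine(v, w):
    """Dot product of v and w divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_v * norm_w)


pairs = [("apricot", "information"),
         ("digital", "information"),
         ("apricot", "digital")]
for a, b in pairs:
    print(a, b, round(cosine(VECTORS[a], VECTORS[b]), 2))
# apricot information 0.16
# digital information 0.58
# apricot digital 0.0
```

On these counts digital and information are the most similar pair, while apricot and digital share no context at all.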
Other possible similarity measures
Evaluating similarity (the same as for thesaurus-based)
Levied is closest in meaning to which of these:
imposed, believed, requested, correlated