1 of 60

CS 162: Natural Language Processing

Saadia Gabriel

Lecture 2:

Semantics & Pragmatics Part 1

2 of 60

Announcements

  • The Bruin Learn course website is visible
  • Everyone enrolled should be added to Piazza
  • We may need to shift the timing of Discussion B; an update is coming soon (take the poll linked under Week 1 on Bruin Learn)

3 of 60

Lecture Outline

1. Pros and Cons of Word Representations

2. Introduce Concepts of Lexical and Vector Semantics

Slides Courtesy of Nanyun (Violet) Peng

4 of 60

How do we represent a word?

  • How do we “understand” a word?

  • How can we know the relation/distance/similarity between words computationally?

5 of 60

Representing words as discrete symbols

  • Naïve way: represent words as atomic symbols: student, talk, university (bag of words, BoW)

  • Represent a word as a “one-hot” vector, e.g. “university”:

         egg   student   talk   university   happy   buy
        [  0      0       0         1          0      0   …  0 ]

  • How large is (what’s the dimension of) this vector?
    • Vector dimension = number of words in vocabulary 
      • PTB data: ~50k
      • Google 1T data: 13M
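As a sketch, the one-hot scheme is easy to implement; the toy vocabulary below is illustrative, not from a real corpus:

```python
# Minimal one-hot word vectors over a toy vocabulary (illustrative only).
vocab = ["egg", "student", "talk", "university", "happy", "buy"]
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """All zeros except a 1 at the word's vocabulary index."""
    vec = [0] * len(vocab)
    vec[word_to_index[word]] = 1
    return vec

print(one_hot("university"))  # [0, 0, 0, 1, 0, 0]
```

With a real vocabulary (~50k words for PTB, 13M for Google 1T), each vector has that many dimensions and still contains a single 1.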


6 of 60

***Discussion***

Is this a good representation?

7 of 60

Issues?

  • Dimensionality is large; vector is sparse

  • No similarity

   Vhappy = [0  0  0  1  0 ... 0]

   Vsad   = [0  0  1  0  0 ... 0]

   Vmilk  = [1  0  0  0  0 ... 0]

   Vhappy · Vsad  =  Vhappy · Vmilk  =  0

  • Cannot represent new words
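The “no similarity” point can be checked directly: the dot product of the one-hot vectors for any two distinct words is 0, however related the words are. A minimal sketch:

```python
# One-hot vectors assign zero similarity to every pair of distinct words.
vocab = ["milk", "egg", "sad", "happy", "talk"]

def one_hot(word):
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v_happy, v_sad, v_milk = one_hot("happy"), one_hot("sad"), one_hot("milk")
print(dot(v_happy, v_sad))   # 0 -- "happy" looks as unrelated to "sad"...
print(dot(v_happy, v_milk))  # 0 -- ...as it does to "milk"
```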


8 of 60

What about unseen words/phrases?

9 of 60

What is Lexical Semantics?

Word meanings that can help decide:

  • Word Similarity 
    • Distributional (Vector) Models of Meaning

  • Word Relations

  • Word Sense Disambiguation

  • Semantic Roles

Distributional Hypothesis (Harris, 1954):

A word is characterized by the company it keeps (Firth, 1957). In other words, words that occur in the same contexts tend to have similar meanings.

10 of 60

Intuition of Semantic Similarity

  • Semantically Close
    • bank-money
    • apple-fruit
    • tree-forest
    • bank-river
    • pen-paper
    • run-walk
    • mistake-error
    • car-wheel
  • Semantically Distant
    • doctor-mall
    • painting-January
    • math-river
    • apple-penguin
    • nurse-fruit
    • pen-river
    • clown-rocket
    • car-algebra

11 of 60

Why Are Two Words Related?

  • Meaning
    • Two concepts are close in terms of meaning (want-desire)

  • World knowledge
    • Two concepts have similar properties, often occur together, or occur in similar contexts (pencil-pen, pen-ink, dog-cat)

  • Psychology
    • We often think of the two concepts together (voting-home address, red-luck[in some culture])

12 of 60

Validity of Semantic Similarity

  • Is semantic distance a valid linguistic phenomenon? How would you test this?
  • Experiment (Rubenstein and Goodenough, 1965)
    • Compiled a list of word pairs
    • Subjects asked to judge semantic distance (from 0 to 4) for each pair
  • Results
    • Rank correlation between subjects is ~ 0.9
    • People are consistent!
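The consistency check is a rank correlation. Below is a plain-Python sketch of Spearman's rank correlation; the two subjects' 0–4 judgments are invented for illustration, not the actual Rubenstein and Goodenough data:

```python
# Spearman rank correlation between two subjects' similarity judgments.
# Judgment scores (0-4 scale) below are made up for illustration.

def ranks(xs):
    """Rank values 1..n, averaging ranks over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank for the tied block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

subject_a = [3.9, 3.5, 0.5, 1.0, 2.8]  # five hypothetical word pairs
subject_b = [4.0, 3.0, 0.2, 1.5, 3.0]
print(round(spearman(subject_a, subject_b), 2))  # high agreement
```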

13 of 60

***Discussion***

What can we use semantic similarity for?

14 of 60

Word similarity for plagiarism detection

15 of 60

Word similarity for historical linguistics: semantic change over time

Kulkarni, Al-Rfou, Perozzi, Skiena 2015

16 of 60

Word similarity reflects gender stereotype

327 gender-neutral occupations, projected onto the she-he direction.

17 of 60

Two classes of similarity algorithms

  • Thesaurus-based algorithms
    • Are words “nearby” in a thesaurus hierarchy?
    • Do words have similar glosses (definitions)?

18 of 60

  • Distributional algorithms
    • Do words behave similarly in real-world usage?
  • Thesaurus-based algorithms
    • Are words “nearby” in a thesaurus hierarchy?
    • Do words have similar glosses (definitions)?

Two classes of similarity algorithms

19 of 60

WordNet: Online thesaurus

Developed at Princeton University in the 1980s

20 of 60

https://en-word.net/

A hierarchically organized lexical database of English

There are now multilingual Wordnets for 200+ languages: https://globalwordnet.org/resources/wordnets-in-the-world

21 of 60

22 of 60

Terminology: lemma and wordform

  • A lemma or citation form
    • Representation of all forms with the same stem, part of speech, rough semantics
  • A wordform
    • The inflected word as it appears in text

23 of 60

Lemmas have senses

  • One lemma “bank” can have many meanings:

Sense 1:

…a bank can hold the investments in a custodial account…

Sense 2:

…as agriculture burgeons on the east bank, the river will shrink even more…

  • Sense (or word sense)
    • A discrete representation of an aspect of a word’s meaning.

24 of 60

Homonymy:

multi-sense as an artifact

Homonyms: words that share a form (spelling or pronunciation) but have unrelated, distinct meanings:

  • bank1: financial institution
  • bank2: sloping land

  • bat1: club for hitting a ball
  • bat2: nocturnal flying mammal

25 of 60

Homonymy:

multi-sense as an artifact

A related multilingual concept is “false friends”: words with identical or similar forms in two languages but different meanings across them.

Think “pain” in French (bread) vs. “pain” in English.

26 of 60

***Discussion***

Why might homonymy be problematic in real-world applications?

27 of 60

Homonymy causes problems for NLP applications

  • Information retrieval

“bat care”

  • Machine Translation

bat → murciélago (animal) or bate (for baseball)

  • Text-to-Speech

bass (stringed instrument) vs. bass (fish)

28 of 60

Polysemy: related multi-sense

  • 1. The bank was constructed in 1875 out of local red brick.

  • 2. I withdrew the money from the bank 

  • Are those the same sense?

Sense 1: “The building belonging to a financial institution”

Sense 2: “A financial institution”

  • A polysemous word has related meanings. Most non-rare words have multiple meanings.

29 of 60

How do we know when a word has more than one sense?

30 of 60

Synonyms

  • Words (different forms) that have the same meaning in some or all contexts.

    • couch / sofa
    • big / large
    • automobile / car
    • vomit / throw up
    • water / H2O

31 of 60

32 of 60

Antonyms

  • Senses that are opposites with respect to one feature of meaning
  • Otherwise, they are very similar!
  • More formally, antonyms can:
    • define a binary opposition:

      dark/light   short/long   hot/cold   fast/slow

    • be reversives:

      rise/fall   up/down

33 of 60

Hyponymy and Hypernymy

One sense is a hyponym of another if the first sense is more specific, denoting a subclass of the other:

  • car is a hyponym of vehicle
  • mango is a hyponym of fruit

Conversely, hypernym/superordinate (“hyper is super”):

  • vehicle is a hypernym of car
  • fruit is a hypernym of mango

34 of 60

Hyponymy more formally

  • Entailment:

A sense A is a hyponym of sense B if being an A entails being a B

  • Hyponymy is usually transitive 
    • (A hypo B and B hypo C entails A hypo C)
  • Another name: the IS-A hierarchy
    • A IS-A B      (or A ISA B)

B subsumes A

35 of 60

Meronymy

  • The part-whole relation

A leg is part of a chair; a wheel is part of a car.

  • Wheel is a meronym of car, and car is a holonym of wheel

36 of 60

How is “sense” defined in WordNet?

  • The synset (synonym set), the set of near-synonyms, instantiates a sense or concept, with a gloss
  • Example: chump as a noun with the gloss:

“a person who is gullible and easy to take advantage of”

This sense of “chump” is shared by 9 words:

chump1, fool2, gull1, mark9, patsy1, fall guy1, sucker1, soft touch1, mug2

37 of 60

Senses of “bass” in Wordnet

38 of 60

WordNet Hypernym Hierarchy for “bass”

39 of 60

Word Similarity

  • Synonymy: a binary relation
    • Two words are either synonymous or not
  • Similarity (or distance): a looser metric
    • Two words are more similar if they share more features of meaning
  • Similarity is properly a relation between senses
    • It’s not the word “bank” that is similar to the word “slope”
    • Rather, bank1 is similar to fund3
    • bank2 is similar to slope5

But we can compute similarity over both words and senses!

40 of 60

Two classes of similarity algorithms

41 of 60

Path based similarity

  • Two concepts (senses/synsets) are similar if they are near each other in the thesaurus hierarchy
    • i.e., they have a short path between them
    • a concept has a path of length 1 to itself


42 of 60

Refinements to path-based similarity

43 of 60

Example: path-based similarity: simpath(c1,c2) = 1/pathlen(c1,c2)
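A sketch of simpath on a toy IS-A hierarchy (the hierarchy below is invented for illustration; it is not actual WordNet data). Path length counts nodes, so a concept has path length 1 to itself:

```python
# Path-based similarity over a toy IS-A hierarchy (not real WordNet data).
hypernym = {            # child -> parent (IS-A)
    "car": "vehicle", "truck": "vehicle", "vehicle": "artifact",
    "mango": "fruit", "fruit": "food", "food": "entity",
    "artifact": "entity",
}

def path_to_root(c):
    """Chain of hypernyms from a concept up to the root."""
    path = [c]
    while path[-1] in hypernym:
        path.append(hypernym[path[-1]])
    return path

def pathlen(c1, c2):
    """Number of nodes on the shortest path through the hierarchy."""
    p1, p2 = path_to_root(c1), path_to_root(c2)
    common = set(p1) & set(p2)
    # lowest common ancestor = first shared node on c1's upward path
    lca = next(c for c in p1 if c in common)
    return p1.index(lca) + p2.index(lca) + 1

def simpath(c1, c2):
    return 1 / pathlen(c1, c2)

print(simpath("car", "car"))    # 1.0: path length 1 to itself
print(simpath("car", "truck"))  # 1/3: car -> vehicle -> truck
print(simpath("car", "mango"))  # 1/7: only meet at the root
```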

44 of 60

Thesaurus Methods: Limitations

  • Measure is only as good as the resource
    • Missing nuances (e.g., good vs. proficient)
    • Missing new concepts/new meanings of words
  • Limited in scope
    • Assumes IS-A relations
    • Works mostly for nouns
  • Role of context not accounted for
  • Not easily domain-adaptable
  • Resources not available in many languages

45 of 60

Next...

46 of 60

Theoretical foundation of distributional semantics

Intuitions:  Zellig Harris (1954):

  • “oculist and eye-doctor … occur in almost the same environments”

  • “If A and B have almost identical environments we say that they are synonyms.”

47 of 60

Intuition for distributional word similarity

  • Words that occur in the same contexts tend to have similar meanings


48 of 60

More intuition for distributional word similarity

A bottle of tesgüino is on the table

Everybody likes tesgüino

Tesgüino makes you drunk

We make tesgüino out of corn.

  • From the context words, humans can guess that tesgüino means an alcoholic beverage like beer

Let’s use this observation to define a new similarity algorithm.

49 of 60

Modeling words with vectors

  • Model the meaning of a word by “embedding” it in a vector space.
  • The meaning of a word is a vector of numbers.

Vector models are also called “embeddings”

  • Contrast: previously, word meaning was represented by a vocabulary index (“word number 545” → one-hot vector)

50 of 60

Two classes of vector representation

51 of 60

Term-document matrix

52 of 60

 Term-document matrix
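A sketch of the idea with toy documents (invented for illustration): rows are terms, columns are documents, cells are counts.

```python
# Term-document matrix: word counts per document (toy data, illustrative).
from collections import Counter

docs = {
    "doc1": "the bank approved the loan",
    "doc2": "the river bank flooded",
}
counts = {name: Counter(text.split()) for name, text in docs.items()}
terms = sorted({w for c in counts.values() for w in c})

# Each row is a term's count vector across the documents.
matrix = {t: [counts[d][t] for d in docs] for t in terms}
print(matrix["bank"])   # [1, 1]: once in each document
print(matrix["river"])  # [0, 1]: only in doc2
```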

53 of 60

The words in a term-document matrix

54 of 60

The words in a term-document matrix

55 of 60

Issues about the term-document matrix

56 of 60

The word-word or word-context matrix

What dimensions do we have now?

57 of 60

The word-word or word-context matrix

Note: Very sparse! (~ 50,000 x 50,000)

We know the meanings are similar because of similar contexts
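A sketch of the construction on a toy corpus (corpus and window size invented for illustration): count context words within a small window, then compare rows with cosine similarity.

```python
# Word-word co-occurrence matrix from a toy corpus, compared by cosine.
import math
from collections import defaultdict

corpus = [
    "i like deep learning",
    "i like nlp",
    "i enjoy flying",
]
window = 1  # count words within +/-1 positions as context

counts = defaultdict(lambda: defaultdict(int))
for sent in corpus:
    toks = sent.split()
    for i, w in enumerate(toks):
        for j in range(max(0, i - window), min(len(toks), i + window + 1)):
            if j != i:
                counts[w][toks[j]] += 1

vocab = sorted({w for s in corpus for w in s.split()})

def row(w):
    """Context-count vector for word w over the vocabulary."""
    return [counts[w][c] for c in vocab]

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

# "like" and "enjoy" share the context "i", so they come out similar
print(round(cosine(row("like"), row("enjoy")), 2))
print(round(cosine(row("like"), row("flying")), 2))
```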

58 of 60

Word-word matrix

59 of 60

Problem with raw counts

60 of 60

Next Monday…

Mutual Information I(X; Y): measures how much knowing a variable X reduces our uncertainty about a variable Y.
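As a preview, mutual information is I(X; Y) = Σ_{x,y} p(x,y) log2[ p(x,y) / (p(x) p(y)) ]. A toy computation (the joint distribution below is invented for illustration):

```python
# Mutual information for a toy joint distribution over binary X and Y.
import math

p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # p(x, y)

# Marginals p(x) and p(y) by summing out the other variable.
px = {x: sum(v for (a, b), v in p.items() if a == x) for x in (0, 1)}
py = {y: sum(v for (a, b), v in p.items() if b == y) for y in (0, 1)}

mi = sum(v * math.log2(v / (px[x] * py[y]))
         for (x, y), v in p.items() if v > 0)
print(round(mi, 3))  # positive: knowing X reduces uncertainty about Y
```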