A broad-coverage semantic classification of the English clause-embedding lexicon
Aaron Steven White
University of Rochester
MECORE Kickoff Workshop
University of Edinburgh
21 October 2021
Slides
aaronstevenwhite.io
Data + Code
megaattitude.io
Collaborators
Kyle Rawlins
Johns Hopkins University
Ellise Moon
University of Rochester
Hannah An
University of Rochester
Ben Kane
University of Rochester
Will Gantt
University of Rochester
Overarching Question
What are the components that clause-taking predicates' semantic values are built from?
Subquestion #1
Which inferences triggered by sentences containing clause-embedding predicates are associated with lexical information?
Jo hated that Bo left. ⇝ Bo left.
Veridicality inference
NP V S ⇝ S
Subquestion #2
Given a set of inference types, which possible inferential patterns associated with lexical items are attested?
Jo hated that Bo left. ⇝ Bo left.
NP V S ⇝ S
Veridicality inference
Jo hated that Bo left. ⇝ Jo believed Bo left.
Doxastic inference
NP V S ⇝ NP believe S
Jo hated that Bo left. ⇝ Jo didn't want Bo to have left.
Bouletic inference
NP V S ⇝ NP not want S
Predicate | NP V S ⇝ S | NP V S ⇝ NP believe S | NP V S ⇝ NP want S |
think | 0 | + | 0 |
doubt | 0 | - | 0 |
hope | 0 | 0 | + |
hate | + | + | - |
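The table above lists four attested patterns over three inference types. The following is a minimal sketch, assuming each inference type takes one of three values (+, 0, -) as in the table, of how attested patterns and unattested combinations could be enumerated. The names `TABLE`, `possible`, `attested`, and `gaps` are illustrative, and the four rows are only the examples shown here, not the full lexicon.

```python
from itertools import product

# Rows of the table above: (veridicality, doxastic, bouletic).
TABLE = {
    "think": ("0", "+", "0"),
    "doubt": ("0", "-", "0"),
    "hope": ("0", "0", "+"),
    "hate": ("+", "+", "-"),
}

# Every logically possible combination of three values over three inference types.
possible = set(product("+0-", repeat=3))

# Patterns instantiated by at least one predicate in the toy table.
attested = set(TABLE.values())

# Combinations that no predicate in the toy table instantiates.
gaps = possible - attested

print(len(possible), len(attested), len(gaps))
```

With only four example predicates most combinations are trivially unattested; the question becomes interesting once the same bookkeeping is applied to lexicon-scale data.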
Theoretical Import
Gaps in attested patterns potentially suggest deep constraints on lexicalization.
Horn 1972, Barwise & Cooper 1981, Levin & Rappaport Hovav 1991, a.o.
Goals for today's talk
Which logically possible inference patterns are both attested and predictive of syntactic distribution?
Approach
Veridicality inference
NP V S ⇝ (not) S
NP not V S ⇝ (not) S
Doxastic inference
NP V S ⇝ NP (not) believe S
NP not V S ⇝ NP (not) believe S
Bouletic inference
NP V S ⇝ NP (not) want S
NP not V S ⇝ NP (not) want S
Neg-raising inference
NP not V S ⇝ NP V not S
Roadmap
Measuring distribution
MegaAcceptability dataset
Acceptability for 1,000 verbs in 50 syntactic frames focused on clause-embedding.
White & Rawlins 2016, 2020
Verbs: think, know, wonder, love, surprise, tell, say, start, stop, ...
Bleaching method
Frame templates (e.g. NP __ that S) instantiated by semantically bleached fillers.
Someone __ed something happened
Someone __ed that something happened
Someone __ed whether something happened
Someone __ed which someone something happened
Someone __ed someone that something happened
Someone __ed someone whether something happened
Someone __ed to someone that something happened
Someone __ed to do something
Someone __ed someone to do something
...
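A minimal sketch of how frame templates like those above might be instantiated for each verb. The template strings mirror the bleached frames listed on this slide; the helper names (`past_tense`, `instantiate`), the small irregular-verb table, and the naive past-tense rule are assumptions for illustration, not the project's actual item-generation code.

```python
# Hypothetical sketch: fill bleached frame templates with a verb's past-tense form.
FRAMES = {
    "NP _ S": "Someone {verb} something happened",
    "NP _ that S": "Someone {verb} that something happened",
    "NP _ whether S": "Someone {verb} whether something happened",
    "NP _ to VP": "Someone {verb} to do something",
}

# Irregular past-tense forms listed by hand (assumed lookup table).
IRREGULAR_PAST = {"think": "thought", "know": "knew", "tell": "told", "say": "said"}


def past_tense(verb):
    """Naive past-tense rule; real items would need a proper morphological lexicon."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


def instantiate(verbs, frames=FRAMES):
    """Return one bleached sentence per (verb, frame) pair."""
    return {
        (verb, frame): template.format(verb=past_tense(verb)) + "."
        for verb in verbs
        for frame, template in frames.items()
    }


items = instantiate(["think", "wonder", "love"])
print(items[("wonder", "NP _ whether S")])
print(len(items))
```

Crossing every verb with every frame in this way is what produces one item per verb-frame pair.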
Verbs x Frames
50,000 total items x 5 judgments per item
Question
Is bleaching a valid method for capturing the acceptability of a verb in a frame?
Validation Strategy
Compare judgments for bleached items against judgments from trained linguists.
Validation data
Comparison
Correlation between judgments from LI and Sprouse et al.'s (2013) dataset
[Figure: correlation between Sprouse et al.'s (2013) Linguistic Inquiry judgments and MegaAcceptability judgments]
Conclusion
Safe to use bleaching to collect acceptability judgments focused on capturing selection.
Important Point
Be cautious in using this dataset to investigate individual predicates.
Measuring inference
Recipe
Veridicality task (White & Rawlins 2018)
Someone was irritated that a particular thing happened.
Did that thing happen?
Response options: no / maybe or maybe not / yes
Someone {knew, didn't know} that a particular thing happened.
NP _ that S
Someone {was, wasn't} surprised that a particular thing happened.
NP be _ that S
Someone {needed, didn’t need} for a particular thing to happen.
NP _ for NP to VP
Someone {told, didn’t tell} a particular person to do a particular thing.
Someone {believed, didn’t believe} a particular person to have a particular thing.
NP _ NP to VP[+/-eventive]
A particular person {was, wasn’t} excited to do a particular thing.
A particular person {was, wasn’t} suspected to have a particular thing.
NP be _ to VP[+/-eventive]
A particular person {managed, didn’t manage} to do a particular thing.
A particular person {seemed, didn’t seem} to have a particular thing.
NP _ to VP[+/-eventive]
Neg-raising task (An & White 2020)
If I were to say I don’t think that a particular thing happened, how likely is it that I mean I think that that thing didn’t happen?
Response scale: Extremely unlikely to Extremely likely
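A particular person {didn’t, doesn’t} know that a particular thing happened.
I {didn’t, don’t} know that a particular thing happened.
NP _ that S
A particular person {wasn’t, isn’t} surprised that a particular thing happened.
I {wasn’t, ’m not} surprised that a particular thing happened.
NP be _ that S
A particular person {wasn’t, isn’t} told to do a particular thing.
I {wasn’t, ’m not} told to do a particular thing.
A particular person {wasn’t, isn’t} believed to have a particular thing.
I {wasn’t, ’m not} believed to have a particular thing.
NP be _ to VP[+/-eventive]
A particular person {didn’t, doesn’t} manage to do a particular thing.
I {didn’t, don’t} manage to do a particular thing.
A particular person {didn’t, doesn’t} seem to have a particular thing.
I {didn’t, don’t} seem to have a particular thing.
NP _ to VP[+/-eventive]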
Doxastic task (Kane et al. 2021)
If A knew that C happened, how likely is it that A believed that C happened?
If A persuaded B that C happened, how likely is it that B believed that C happened?
Response scale: Extremely unlikely to Extremely likely
Bouletic task (Kane et al. 2021)
If A was appalled that C happened, how likely is it that A wanted C to have happened?
If A apologized to B that C happened, how likely is it that B wanted C to have happened?
Response scale: Extremely unlikely to Extremely likely
A {knew, didn't know} that C happened.
NP _ that S
A {told, didn't tell} B that C happened.
NP _ NP that S
A {said, didn't say} to B that C happened.
NP _ to NP that S
A {was, wasn’t} surprised that C happened.
NP be _ that S
A {hoped, didn't hope} that C would happen.
NP _ that S[+future]
A {promised, didn't promise} B that C would happen.
NP _ NP that S[+future]
A {predicted, didn't predict} to B that C would happen.
NP _ to NP that S[+future]
A {was, wasn’t} excited that C would happen.
NP be _ that S[+future]
Question
Is bleaching a valid method for capturing inferences associated with a verb in a frame?
Validation Strategy #1
Compare judgments for bleached items against judgments from trained linguists.
Frame | Neg-raising | Non-neg-raising |
NP __ that S | think, believe, feel, reckon, figure, guess, suppose, imagine | announce, claim, assert, report, know, realize, notice, find out |
NP __ to VP | want, wish, happen, seem, plan, intend, mean, turn out | love, hate, need, continue, try, like, desire, decide |
[Figure: mean rating of bleached example for neg-raising vs. non-neg-raising predicates]
Validation Strategy #2
Compare judgments for bleached items to judgments for more contentful items.
Implementation
For each verb-frame pair in the validation set, sample five items from a corpus.
[Figure: mean rating of corpus example vs. mean rating of bleached example; r = 0.8 (p < 0.001)]
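The comparison above is a correlation between per-pair mean ratings. A minimal sketch of that computation, assuming a plain Pearson correlation over verb-frame means; the function name `pearson_r` and the toy ratings are invented for illustration and are not the validation data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Invented mean ratings for four verb-frame pairs (not real data).
bleached_means = [0.10, 0.40, 0.55, 0.90]
corpus_means = [0.20, 0.35, 0.60, 0.80]

print(round(pearson_r(bleached_means, corpus_means), 3))
```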
Validation Strategy #3
Compare inference judgments for bleached items to acceptability judgments for an established distributional diagnostic.
Implementation
For each verb-frame pair in the validation set, collect acceptability judgments for a strong NPI (additive either).
Jo didn’t do a particular thing, and…
…I think that Bo didn’t do that thing either.
…I don’t think that Bo did that thing either.
[Figure: mean rating of bleached example vs. mean acceptability of strong NPI; r = 0.77 (p < 0.001)]
Conclusion
Safe to use bleaching to collect at least these types of inference judgments.
Important Point (again)
Be cautious in using this dataset to investigate individual predicates.
Discovering inference patterns
Approach
Cluster predicate-frame pairs in inference space using a multiview mixed effects mixture model.
Predicate | NP V S ⇝ S | NP V S ⇝ NP believe S | NP V S ⇝ NP want S |
think | 0 | + | 0 |
doubt | 0 | - | 0 |
hope | 0 | 0 | + |
hate | + | + | - |
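To give a feel for what clustering in inference space means, here is a deliberately simplified sketch. It swaps in plain k-means over the toy table above for the multiview mixed effects mixture model named in the approach, so it ignores participant effects and the different response types. The numeric coding (+ as 1.0, 0 as 0.0, - as -1.0) and all names are assumptions for illustration.

```python
# Toy coding of the table: (veridicality, doxastic, bouletic).
PATTERNS = {
    "think": (0.0, 1.0, 0.0),
    "doubt": (0.0, -1.0, 0.0),
    "hope": (0.0, 0.0, 1.0),
    "hate": (1.0, 1.0, -1.0),
}


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Plain k-means with a deterministic start (the first k points)."""
    centers = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center for every point.
        assignment = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centers


predicates = list(PATTERNS)
assignment, centers = kmeans([PATTERNS[p] for p in predicates], k=2)
for predicate, cluster in zip(predicates, assignment):
    print(predicate, cluster)
```

Each resulting center plays the role of an inference pattern: a point in inference space that several predicate-frame pairs sit close to.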
[Figure: know + NP _ that S across inference patterns 1 to 12. Panels: Veridicality (no, maybe, yes), Doxastic, Bouletic, Neg-raising; scale 0 to 1.]
Finding clusters
Fit model to raw that-clause data in MegaVeridicality, MegaNegRaising, and MegaIntensionality using variational inference.
Output
[Figure: know + NP _ that S across inference patterns 1 to 12. Panels: Veridicality (no, maybe, yes), Doxastic, Bouletic, Neg-raising; scale 0 to 1.]
Question
How many inference patterns should we assume there are?
Idea
Only as many as we need to explain syntactic distribution.
Implementation
Select the smallest clustering for which no larger clustering improves prediction of the judgments in MegaAcceptability.
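The selection rule above can be sketched as follows, assuming each candidate number of clusters has already been scored by how well it predicts the MegaAcceptability judgments (higher is better). The function name, the `tol` parameter, and the scores are invented for illustration; they are not the reported results.

```python
def select_num_clusters(scores, tol=0.0):
    """Return the smallest k such that no larger k scores better by more than tol.

    scores: dict mapping number of clusters -> predictive score (higher is better).
    """
    candidates = sorted(scores)
    for k in candidates:
        larger = [other for other in candidates if other > k]
        if all(scores[other] <= scores[k] + tol for other in larger):
            return k
    raise ValueError("scores must not be empty")


# Invented scores for five candidate clusterings (not real results).
toy_scores = {2: 0.50, 3: 0.58, 4: 0.63, 5: 0.63, 6: 0.62}
print(select_num_clusters(toy_scores))
```

A nonzero `tol` would treat small gains from larger clusterings as noise, which makes the rule favor fewer patterns.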
[Figure: diagram with panels labeled Predicate, Cluster, and Frame]
Result
Optimal number of inference patterns is 15.
Interpretation
There are at least 15 distributionally correlated inference patterns.
Important Point #1
Not every inference pattern instantiated by a particular predicate will get its own cluster.
Important Point #2
Enriching the distributional representation could increase the granularity of the patterns.
Investigating inference patterns
[Figure: know + NP _ that S across inference patterns 1 to 12. Panels: Veridicality (no, maybe, yes), Doxastic, Bouletic, Neg-raising; scale 0 to 1. Accompanying diagram labels: Predicate, Cluster, Frame.]
Negative emotive miratives
expressions of surprise with negative valence
NP was {dazed, flustered, alarmed} that S[+future]
Negative external emotives
expressions of negative emotion with behavioral correlates
NP {whined, whimpered, pouted} to NP that S[+future]
Positive external emotives
expressions of positive emotion with behavioral correlates
NP was {congratulated, praised, fascinated} that S
Positive internal emotives
positive emotional states
NP was {pleased, thrilled, enthused} that S
Preferentials
expressions of preference for a (future) situation
NP {hoped, wished, demanded, recommended} that S[+/-future]
Negative internal emotives
negative emotional states
NP was {frightened, disgusted, infuriated} that S
Negative emotive communicatives
communicative acts with broadly negative valence
NP {screamed, ranted, growled} to NP that S[+future]
Weak communicatives
communicative acts with weak doxastic inferences about the source
NP {reported, remarked, yelped} to NP that S
Representationals
doxastic mental states and mental processes
NP {thought, believed, suspected} that S
Speculatives
communication of uncertain beliefs
NP {ventured, guessed, gossiped} that S
Future commitment
expressions of commitment to future action or result
NP {promised, ensured, attested} S[+future]
Strong communicatives
communicative acts with strong doxastic inferences about the source
NP {confessed, admitted, acknowledged} that S
Deceptives
actions involving dishonesty, deceit, or pretense
NP {lied, misled, faked, fabricated} ((to) NP) that S
Discourse commitment
communicative acts committing the source to the content’s truth
NP {maintained, remarked, swore} that S[+future]
Conclusion
Overarching Question
What are the components that clause-taking predicates' semantic values are built from?
Current Directions
How do we discover the underlying representational components?
Subdirection #1
Decomposition of the inference patterns themselves.
Correlations across inference types
Correlations across inference patterns
Subdirection #2
Decomposition of the relationship between inference patterns and syntactic distribution.
Relationship between inference patterns and syntax
Correlations across syntactic structures
Subdirection #3
Decomposition of the relationship between inference patterns and lexical items.
Correlations across predicates
[Figure: Predicate by Cluster]
Possible Unified Approach
Multi-task combinatory categorial grammar induction with structured denotation decoders
Gene Kim
University of Rochester
Thanks!
Supported by NSF-BCS-1748969
The MegaAttitude Project: Investigating selection and polysemy at the scale of the lexicon
Appendix A: Further Validation of MegaAcceptability
Case Study
The vast majority of about-PPs are adjuncts
Rawlins 2013, 2014
If XP1 V (XP2) (XP3) about XP4 is acceptable, then XP1 V (XP2) (XP3) is acceptable.
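One way the adjunct prediction above could be checked against acceptability data is to count, at a given acceptability threshold, how often a verb is acceptable with an about-PP but not without one. This is a minimal sketch under stated assumptions: the function name, the choice of denominator (verbs acceptable with the about-PP), and the ratings are all hypothetical and not taken from the dataset or the original analysis.

```python
def proportion_violations(ratings, threshold):
    """Share of verbs acceptable with an about-PP that are unacceptable without it.

    ratings: dict mapping verb -> (acceptability without about-PP,
                                   acceptability with about-PP).
    """
    # Verbs whose about-PP frame counts as acceptable at this threshold.
    licensed = [verb for verb, (_, with_about) in ratings.items() if with_about >= threshold]
    if not licensed:
        return 0.0
    # A violation: the about-PP frame is acceptable but the bare frame is not.
    violations = [verb for verb in licensed if ratings[verb][0] < threshold]
    return len(violations) / len(licensed)


# Invented normalized ratings for three verbs (not real data).
toy_ratings = {
    "talk": (0.9, 0.8),
    "wonder": (0.3, 0.9),
    "sleep": (0.9, 0.1),
}
print(proportion_violations(toy_ratings, threshold=0.5))
```

Sweeping `threshold` over a range of values gives a curve of violation proportions against the acceptability threshold.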
[Figures: NP _ed vs. NP _ed about XP (Rawlins 2014) and NP (was) _ed vs. NP (was) _ed about whether S. Axes: noise variance / acceptability variance, acceptability threshold, proportion violations. Reference: independence.]
Appendix D: Distribution of Inference Judgments
Appendix C: Validation of MegaIntensionality
Question
Is bleaching a valid method for capturing doxastic and bouletic inferences associated with a verb in a frame?
Challenge
Doxastic and bouletic inferences are highly sensitive to world knowledge.
Jo doubts that Bo left. ⇝ Jo doesn't believe that Bo left.
Jo doubts that Bo left. ⇝ Jo wants Bo to have left.
Trump doubts that he won in 2020.
Trump wants to have won in 2020.
Approach
Executives generally want their deals to go through.
Executives generally believe that their deals will go through.
Norming
The executive knew that his deal had gone through.
Contentful
A knew that C happened.
Bleached
Appendix D: Number of possible inference patterns
(3 veridicality inferences)^(2 matrix polarities)
x (3 doxastic inferences)^(2 matrix polarities)
x (3 bouletic inferences)^(2 matrix polarities)
x 2 neg-raising inferences
= 3^2 x 3^2 x 3^2 x 2 = 1,458 inference patterns
If any lexical knowledge relevant to any inference type is gradient (and continuous), there are uncountably many patterns.
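The discrete count of 1,458 can be checked by brute-force enumeration. This is a minimal sketch, assuming three possible values per inference type under each matrix polarity and a binary neg-raising value, as in the product above; the value labels and variable names are illustrative.

```python
from itertools import product

INFERENCE_VALUES = ("+", "0", "-")      # three possible inferences per type
POLARITIES = ("positive", "negative")   # two matrix polarities
NEG_RAISING = (True, False)             # two neg-raising inferences

# One value per matrix polarity for a single inference type: 3 ** 2 combinations.
per_type = list(product(INFERENCE_VALUES, repeat=len(POLARITIES)))

# Veridicality x doxastic x bouletic x neg-raising.
all_patterns = list(product(per_type, per_type, per_type, NEG_RAISING))

print(len(per_type), len(all_patterns))
```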
Appendix E: Principal Component Analysis
95% of variance