Measuring productivity through language comparison
Panel on ‘Syntactic Productivity’ | GAC 2022
Bert Le Bruyn and colleagues
Utrecht University
> Reflections on productivity based on insights from the NWO-funded Time in Translation project.
> The goal of the project is to write a cross-linguistically stable semantics of a single construction: have/be in the present tense, followed by a past participle.
> To do justice to our construction-based view, we consistently refer to it as have/be + pp.
> In English grammars, this construction is known as the Present Perfect, in Dutch grammars as the VTT, etc.
> Translation corpora have been at the methodological heart of the project (take a look at our interface @ tinyurl.com/tint4u).
> Examples from our L’Étranger corpus.
Towards an extra dimension of productivity?
Productivity – in syntax – refers to the range of lexical items that may fill the slots of constructions. (Perek 2016; Van den Heede et al. this panel)
Two empirical observations
> There are no strict lexical semantic limitations on the verbs that can appear in the PP slot of have/be + pp.
> Despite there being no strict limitations at the lexical semantic level, the use of the construction is not unconstrained.
> have/be + pp: State holding at some point in the past + result with current relevance
I've had to move the dining-room table into my bedroom.
> simple past: State holding at some point in the past
There was a whole stack of bills of lading piling up on my desk and I had to go through them all.
[We arrived at Celeste’s, dripping with sweat […]. He asked me if I was 'all right then'. I said yes and I was hungry.]
> OVT: Past event in storyline
Ik at haastig en dronk daarna mijn koffie. Vervolgens ging ik […]
I ate hastily and drank thereafter my coffee After_that went I
‘I ate very quickly and had my coffee. After that, I went […]’
> have/be + pp: Past event, statement of fact
Ik heb in het restaurant, bij Celeste, gegeten, zoals gewoonlijk.
I have in the restaurant at Celeste’s eaten as usual
‘I had lunch at Celeste’s, as per usual.’
> We find both state- and event-like verbs in the PP slot of have/be + pp.
> We find that this construction can be used to express certain meanings but not others.
> We conclude that have/be + pp is semantically constrained but that the relevant constraints cannot be captured by the lexical semantics of the verb in its PP slot.
Proposal and challenge
Proposal
> In our semantic analysis, we hypothesize that the constraints on have/be + pp are to be derived from its competition with other tense/aspect categories (e.g., the simple past or OVT).
> The intuition is that there is a point at which constructions start to interact and each of them ends up with its share of semantic space, specializing in expressing certain meanings but not others.
> We propose that the competition between constructions and resulting meaning specialization should be seen as involving a dimension of productivity that is complementary to the one identified by Perek (2016).
> In our sense, the productivity of a construction is measured not by the range of lexical items that can fill its slots but by the range of different meanings it can express, as opposed to other constructions.
Challenge
> How can we use corpora to determine the range of meanings conveyed by a construction?
> Is there a corpus-based operationalization of semantics beyond the lexical level?
Rising to the challenge
Strategy
> We assume – in line with a usage-based view of language – that it must be possible to arrive at a corpus-based operationalization of semantics beyond the lexical level.
> Shortcut (or first step): cross-linguistic comparison in multilingual corpora.
Intuitions
> The language-specific distribution of a construction reflects its meaning(s).
> Differences in the distribution of the same construction across languages reflect cross-linguistic differences in meaning.
> Cross-linguistic differences in meaning allow us to pinpoint different meanings.
> In van der Klis et al. (2022), we carry out this cross-linguistic comparison for all French instances of have/be + pp in Chapters 1 to 3 of Camus’ L’Étranger.
> For each context, we determined for each language which (finite indicative) tense it used.
> Each context is represented by a tuple of tenses, e.g., <present perfect, pretérito perfecto compuesto, VTT, Perfekt, passé composé>.
> Extending Wälchli & Cysouw’s (2012) approach to parallel corpus analysis, we created multidimensional scaling (MDS) plots of the contexts based on these tuples.
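The step from tense tuples to a plot can be sketched as follows. This is a minimal illustration, not the procedure of van der Klis et al. (2022): it assumes that the dissimilarity between two contexts is the share of languages in which their tenses differ. The toy contexts, the helper names `mismatch_distance` and `distance_matrix`, and the placeholder label "other" are all hypothetical. The resulting matrix is the kind of input an MDS routine takes.

```python
# Tuple order follows the example tuple above:
# English, Spanish, Dutch, German, French.
LANGUAGES = ("English", "Spanish", "Dutch", "German", "French")

# Invented toy contexts (NOT corpus data). "other" is a placeholder for a
# tense that is not the language's have/be + pp form.
CONTEXTS = {
    "ctx1": ("present perfect", "pretérito perfecto compuesto", "VTT", "Perfekt", "passé composé"),
    "ctx2": ("simple past", "pretérito perfecto compuesto", "VTT", "Perfekt", "passé composé"),
    "ctx3": ("simple past", "other", "VTT", "Perfekt", "passé composé"),
    "ctx4": ("simple past", "other", "OVT", "Perfekt", "passé composé"),
}


def mismatch_distance(a, b):
    """Share of languages in which two contexts use different tenses."""
    if len(a) != len(b):
        raise ValueError("tense tuples must cover the same languages")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(contexts):
    """Return context ids and the symmetric matrix of pairwise distances."""
    ids = list(contexts)
    matrix = [[mismatch_distance(contexts[i], contexts[j]) for j in ids] for i in ids]
    return ids, matrix


if __name__ == "__main__":
    ids, matrix = distance_matrix(CONTEXTS)
    for cid, row in zip(ids, matrix):
        print(cid, " ".join(f"{d:.1f}" for d in row))
```

Contexts with identical tuples get distance 0 and would coincide in the plot; contexts that differ in more languages end up further apart.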
[Figure: MDS plots of the contexts, one panel per language: English have/be+pp, Dutch have/be+pp, Spanish have/be+pp, German have/be+pp, French have/be+pp]
> Extremely systematic
> All statistically relevant variation captured in 2 dimensions
> Semantically homogeneous clusters
[Figure: MDS plots of the contexts, one panel per language (English, Spanish, Dutch, German, French have/be+pp), with labeled clusters: Current relevance; Hodiernal events; Past events: statement of fact; Past events: part of storyline; States holding at some point in the past]
> These clusters of contexts reflect the different meanings have/be+pp can express.
> The clusters group contexts with the same (finite indicative) tense tuples.
> We submit that these tense tuples are akin to vectors in lexical semantics and are a corpus-based operationalization of semantics beyond the lexical level.
> They allow us to determine the range of meanings have/be+pp can express in each of the languages.
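Grouping contexts by tense tuple can be sketched as below. This is an illustrative assumption about how such grouping could be done, not the project's code: the contexts are invented, "other" is a placeholder label, and `cluster_by_tuple` and `perfect_languages` are hypothetical helper names. The have/be+pp labels per language are the ones from the example tuple given earlier in the talk.

```python
from collections import defaultdict

# have/be+pp form per language, as named in the example tuple in the talk.
# Dict order is also the order of tenses inside each tuple.
PERFECT = {
    "English": "present perfect",
    "Spanish": "pretérito perfecto compuesto",
    "Dutch": "VTT",
    "German": "Perfekt",
    "French": "passé composé",
}
LANGUAGES = tuple(PERFECT)

ALL_PERFECT = tuple(PERFECT.values())

# Invented toy contexts (NOT corpus data); "other" is a placeholder label.
CONTEXTS = [
    ("ctx1", ALL_PERFECT),
    ("ctx2", ALL_PERFECT),
    ("ctx3", ("simple past",) + ALL_PERFECT[1:]),
    ("ctx4", ("simple past", "other", "OVT", "Perfekt", "passé composé")),
]


def cluster_by_tuple(contexts):
    """Group context ids that share exactly the same tense tuple."""
    clusters = defaultdict(list)
    for context_id, tenses in contexts:
        clusters[tenses].append(context_id)
    return dict(clusters)


def perfect_languages(tenses):
    """Languages that use their have/be+pp form in a given tuple."""
    return [lang for lang, tense in zip(LANGUAGES, tenses) if tense == PERFECT[lang]]


if __name__ == "__main__":
    for tenses, ids in cluster_by_tuple(CONTEXTS).items():
        print(ids, "-> have/be+pp in:", ", ".join(perfect_languages(tenses)))
```

Each distinct tuple then stands for one candidate meaning, and the languages returned by `perfect_languages` are those whose have/be+pp covers it.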
Meanings that have/be+pp can express in each language (V = yes, X = no):
have/be+pp | States holding at some point in the past | Past events: part of storyline | Past events: statement of fact | Hodiernal events | Current relevance
French | V | V | V | V | V
German | X | V | V | V | V
Dutch | X | X | V | V | V
Spanish | X | X | X | V | V
English | X | X | X | X | V
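Read this way, the V/X pattern gives a simple count of meanings per language. The sketch below assumes that each V/X row corresponds to one language, in the order French, German, Dutch, Spanish, English, and each column to one meaning, in the order the meanings are listed. The count as a "productivity score" and the function names are illustrative assumptions, not a measure defined in the talk.

```python
MEANINGS = (
    "States holding at some point in the past",
    "Past events: part of storyline",
    "Past events: statement of fact",
    "Hodiernal events",
    "Current relevance",
)

# V/X pattern transcribed from the slide (True = V, False = X).
# Assumption: one row per language, one column per meaning.
CAN_EXPRESS = {
    "French": (True, True, True, True, True),
    "German": (False, True, True, True, True),
    "Dutch": (False, False, True, True, True),
    "Spanish": (False, False, False, True, True),
    "English": (False, False, False, False, True),
}


def meaning_range(language):
    """Set of meanings that have/be+pp can express in the given language."""
    return {m for m, ok in zip(MEANINGS, CAN_EXPRESS[language]) if ok}


def range_sizes():
    """Number of meanings per language (an illustrative productivity score)."""
    return {lang: len(meaning_range(lang)) for lang in CAN_EXPRESS}


def forms_subset_chain():
    """True if every smaller range is contained in the next larger one."""
    ordered = sorted(CAN_EXPRESS, key=lambda lang: len(meaning_range(lang)))
    return all(
        meaning_range(a) <= meaning_range(b) for a, b in zip(ordered, ordered[1:])
    )


if __name__ == "__main__":
    for lang, size in range_sizes().items():
        print(f"{lang}: {size} of {len(MEANINGS)} meanings")
    print("Subset chain:", forms_subset_chain())
```

Under these assumptions the ranges are nested: each language's have/be+pp covers every meaning covered by the language below it in the table.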
Conclusion
> On the basis of insights about the semantics of have/be+pp, we argued that there is a dimension of productivity that is subject to semantic constraints that go beyond the lexical level.
> For have/be+pp, we argued that this dimension of productivity should be defined by the range of meanings this construction can express in a given language.
> We argued that tense tuples – i.e. the tenses that different languages use in a given context – can be used as a translation-corpus-based operationalization of meaning beyond the lexical level, akin to vectors in lexical semantics.
> An extension of this approach is to inductively study the features that contexts with the same tense tuple share and use these as input for an empirically grounded sentence and discourse semantics of have/be+pp.
> Limitations of van der Klis et al. (2022) have been (partly) addressed in later work: Le Bruyn et al. (to appear a) extend the corpus from all French passé composés to all finite indicative tense forms, and Le Bruyn et al. (in prep.) show that the same clusters of contexts emerge from chapters 1 and 16 of Harry Potter and the Philosopher’s Stone and its translations.
> Methodological considerations about the way we use translation corpora are developed in Le Bruyn et al. (to appear a) and in Le Bruyn et al. (to appear b).
References (selected)
Van der Klis, M., Le Bruyn, B., & de Swart, H. (2022). A multilingual corpus study of the competition between past and perfect in narrative discourse. Journal of Linguistics, 58(2), 423-457. [Open Access]
Le Bruyn, B., van der Klis, M., & de Swart, H. (to appear a). Variation and stability: the HAVE-PERFECT and the tense-aspect grammar of western European languages. [accepted for the OUP volume Beyond Time edited by Frank Brisard, Astrid De Wit, Carol Madden, Michael Meeuwis & Adeline Patard] [Prefinal version available via my webpage]
Le Bruyn, B., van der Klis, M., & de Swart, H. (to appear b). Parallel corpus research and target language representativeness: the contrastive, typological and Translation Mining traditions. [accepted for publication in a Languages Special Issue on Tense and Aspect across languages] [Prefinal version available via my webpage]
Wälchli, B., & Cysouw, M. (2012). Lexical typology through similarity semantics: Toward a semantic map of motion verbs. Linguistics, 50(3), 671-710.
Thank you for your attention!