1 of 43

Creating a mini resource grammar

Inari Listenmaa

University of Gothenburg

2017-08-16, Riga

2 of 43

Recap on earlier lectures

3 of 43

Computational Syntax, spring 2023

GF resource grammar

Inari Listenmaa

4 of 43

Recap on GF syntax

Functions

  • abstract:

fun Hello : Noun -> Greeting ;

  • concrete:

lin Hello x = "hello" ++ x ;

  • helper functions:

oper glue : Str -> Str -> Str = \x,y -> x ++ y ;

Categories

  • abstract:

cat Noun ;

  • concrete:

lincat Noun = Str ;

5 of 43

Recap on GF syntax

Records

  • definition:

lincat Det = {s : Str ; n : Number} ;

  • giving a value:

lin a_Det = {s = "a" ; n = Sg} ;

  • accessing a record (projection):

a_Det.s ; -- has value “a”

Tables

  • definition:

lincat Noun = Number => Str ;

  • giving a value:

lin car_N = table {Sg => "car" ; Pl => "cars"} ;

  • accessing a table (selection):

car_N ! Sg; -- has value “car”

6 of 43

Difference between records and tables

Records have named fields that can be of any type.

When projecting a field from a record, you must know the exact name:

{s = "hello" ; compl = "world"}.s ;

Tables are associations from key to value. Keys can only be of finite types (param).

When selecting a value from a table, you can use variables, wildcards, computation:

table {Sg => "car" ; Pl => "cars" } ! Sg ;

s = \\x => noun ! x ++ "foo" ++ adj ! x ;
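
The contrast between records and tables can be tried out in a small resource module. This is only a sketch: the module name RecTable and the opers car, red, phrase and redCar are made up for illustration and do not come from the lecture.

```gf
-- File: RecTable.gf (hypothetical module name)
resource RecTable = {

  param
    Number = Sg | Pl ;

  oper
    -- A table maps every value of the finite type Number to a string.
    Infl : Type = Number => Str ;

    car : Infl = table {Sg => "car" ; Pl => "cars"} ;
    red : Infl = table {_ => "red"} ;   -- wildcard: one form for both numbers

    -- A record: each field is reached by its exact name.
    phrase : {s : Str ; compl : Str} = {s = "hello" ; compl = "world"} ;

    -- Selection with a bound variable: \\n builds one row per Number value.
    redCar : Infl = \\n => red ! n ++ car ! n ;
}
```

In the GF shell, `i -retain RecTable.gf` followed by `cc redCar ! Pl` should evaluate to the tokens red and cars, and `cc phrase.compl` to world.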

7 of 43

Recap on GF syntax

Tuples

  • definition:

Str*Str ;

  • syntactic sugar for:

{p1 : Str ; p2 : Str} ;

  • giving a value:

<"hello","world">

Parameters

  • definition:

param Number = Sg | Pl ;

param Case = Nom | Acc | Dat ;

  • usage: left-hand side of tables; inherent properties

lincat Noun = Number => Str ;

lincat Det = {s : Str ; n : Number} ;

8 of 43

Motivation for RGL

abstract Hello = {
  flags startcat = Greeting ;
  cat
    Greeting ; Recipient ;
  fun
    Hello : Recipient -> Greeting ;
    World : Recipient ;
    Mum : Recipient ;
    Friends : Recipient ;
}

concrete HelloEng of Hello = {
  lincat
    Greeting, Recipient = {s : Str} ;
  lin
    Hello rec = {s = "hello" ++ rec.s} ;
    World = {s = "world"} ;
    Mum = {s = "mum"} ;
    Friends = {s = "friends"} ;
}

9 of 43

Motivation for RGL

concrete HelloIce of Hello = {
  lincat
    Greeting = {s : Str} ;
    Recipient = {s : Str ; n : Number ; g : Gender} ;
  param
    Gender = Fem | Masc | Neutr ;
    Number = Sg | Pl ;
  lin
    -- The tuple order <number, gender> must match the patterns below.
    Hello rec = {s = case <rec.n,rec.g> of {
      <Sg,Masc> => "sæll" ++ rec.s ;
      <Sg,_> => "sæl" ++ rec.s ;
      <Pl,Masc> => "sælir" ++ rec.s ;
      <Pl,_> => "sælar" ++ rec.s }
    } ;
    World = {s = "heimur" ; g = Masc ; n = Sg} ;
    Mum = {s = "mamma" ; g = Fem ; n = Sg} ;
    Friends = {s = "vinir" ; g = Masc ; n = Pl} ;
}

10 of 43

Hello with resource grammar

concrete HelloIce of Hello = open ParadigmsIce, SyntaxIce in {
  lincat
    Greeting = Utt ;
    Recipient = N ;
  lin
    Hello r = mkGreeting hello_Interj r ;
    World = mkN "heimur" ;
    Mum = mkN "mamma" ;
    Friends = mkN "vinir" ;
  oper
    hello_Interj = mkInterj "sæll" ;
    mkGreeting = ...
}

concrete HelloKaz of Hello = open ParadigmsKaz, SyntaxKaz in {
  lincat
    Greeting = Utt ;
    Recipient = N ;
  lin
    Hello r = mkGreeting hello_Interj r ;
    World = mkN "әлем" ;
    Mum = mkN "ана" ;
    Friends = mkN "достар" ;
  oper
    hello_Interj = mkInterj "сәлем" ;
    mkGreeting = ...
}

11 of 43

Resource Grammar Library

  • Translation in general needs semantic predicates
  • ...but syntactic grammar is useful as library
  • A linguist implements the resource grammar library; domain experts use it in applications

(You can always write application grammars without RGL too!)

12 of 43

Now write your own!

Piece of cake, right?

13 of 43

Mini resource grammar

14 of 43

Miniresource in words (1/6)

Sentence & Utterance

  • Build a (present tense) sentence from a polarity and a clause
  • Connect two existing sentences with a conjunction
  • Build a stand-alone utterance from a sentence or a noun phrase

Polarity

  • Pos, Neg

Tense

  • Only present (more will come later!)

15 of 43

Miniresource in words (2/6)

Clause

  • Build a clause from a noun phrase (= the subject) and a verb phrase

Verb phrase

  • Build a verb phrase by elevating a verb
  • Build a verb phrase from a two-place verb and a noun phrase (= the object)
  • Build a verb phrase from an adjective phrase ("big" → "is big")

Verb & Two-place verb

  • walk_V, arrive_V, love_V2, ...

16 of 43

Miniresource in words (3/6)

Noun phrase

  • Build a noun phrase from a determiner and a common noun
  • Build a noun phrase from a pronoun or a proper noun
  • Connect two existing noun phrases with a conjunction

Common noun

  • Build a common noun by elevating a noun
  • Add an adjective phrase to an existing common noun

Noun

  • house_N, tree_N, ...

17 of 43

Miniresource in words (4/6)

Adjective phrase

  • Build an adjective phrase by elevating an adjective

Adjective

  • big_A, small_A, green_A, ...

18 of 43

Miniresource in words (5/6)

Adverb

  • Build an adverb from a preposition and a noun phrase

Preposition

  • in_Prep, on_Prep, with_Prep

19 of 43

Miniresource in words (6/6)

Conjunction

  • and_Conj, or_Conj

Determiner

  • a_Det, the_Det, every_Det, …

Pronoun

  • i_Pron, youSg_Pron, they_Pron, …
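
The six slides above can be collected into one abstract syntax. The following is a sketch, not the official miniresource: the module name MiniSketch is invented, and the function names UttS, UttNP, UsePron, UsePN, PrepNP, PPos and PNeg as well as the category names Utt, PN, Pron, Adv and Prep are assumptions modelled on RGL naming. The remaining functions use the names given on the later slides of this deck.

```gf
-- File: MiniSketch.gf (hypothetical module name)
abstract MiniSketch = {

  flags startcat = Utt ;

  cat
    Utt ; S ; Cl ; Pol ;             -- utterance, sentence, clause, polarity
    VP ; V ; V2 ;                    -- verb phrase, verb, two-place verb
    NP ; CN ; N ; PN ; Pron ; Det ;  -- noun phrase and its parts
    AP ; A ;                         -- adjective phrase, adjective
    Adv ; Prep ;                     -- adverb, preposition
    Conjunction ;

  fun
    -- (1/6) sentence and utterance
    UsePresCl : Pol -> Cl -> S ;
    ConjS     : Conjunction -> S -> S -> S ;
    UttS      : S -> Utt ;            -- assumed name
    UttNP     : NP -> Utt ;           -- assumed name
    PPos, PNeg : Pol ;                -- assumed names

    -- (2/6) clause and verb phrase
    PredVP  : NP -> VP -> Cl ;
    UseV    : V -> VP ;
    ComplV2 : V2 -> NP -> VP ;
    CompAP  : AP -> VP ;

    -- (3/6) noun phrase and common noun
    DetCN   : Det -> CN -> NP ;
    UsePron : Pron -> NP ;            -- assumed name
    UsePN   : PN -> NP ;              -- assumed name
    ConjNP  : Conjunction -> NP -> NP -> NP ;
    UseN    : N -> CN ;
    AdjCN   : AP -> CN -> CN ;

    -- (4/6) adjective phrase
    UseA : A -> AP ;

    -- (5/6) adverb
    PrepNP : Prep -> NP -> Adv ;      -- assumed name

    -- (6/6) and lexicon: only the entries the slides mention
    walk_V, arrive_V : V ;
    love_V2 : V2 ;
    house_N, tree_N : N ;
    big_A, small_A, green_A : A ;
    in_Prep, on_Prep, with_Prep : Prep ;
    and_Conj, or_Conj : Conjunction ;
    a_Det, the_Det, every_Det : Det ;
    i_Pron, youSg_Pron, they_Pron : Pron ;
}
```

Writing the abstract syntax first gives a checklist: every cat needs a lincat and every fun needs a lin in the concrete syntax of the language you are implementing.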

20 of 43

Where to start?

Lexical categories / Morphology

  • Easy to concentrate on one thing
  • Can be reused for the full resource

Syntax

Both

  • See next slide!

21 of 43

Test your grammar regularly

22 of 43

Morphology & lexical categories

23 of 43

Nouns: what

What distinctions do I need?

  • In general / for mini resource

English

  • Variable number: singular / plural
  • Inherent animacy (who or which)
  • Inherent gender (correct referent pronoun)

Finnish

  • Variable number: singular / plural
  • Variable case: nominative, genitive and 12 more

German

  • Variable number: singular / plural
  • Inherent gender
  • Variable case: nominative, genitive, accusative, dative

24 of 43

Nouns: what

What distinctions do I need?

English

  • Noun : Type = {s : Number => Str} ;
  • param Number = Sg | Pl ;

Finnish

  • Noun : Type = {s : Number => Case => Str} ;
  • param Case = Nom | Gen | … ;

German

  • Noun : Type = {s : Number => Case => Str ; g : Gender} ;
  • param Gender = Masc | Fem | Neutr ;

25 of 43

Nouns: how

English: Noun : Type = { s : Number => Str } ;

car_N = { s = table { Sg => "car" ; Pl => "cars" } } ;

bus_N = { s = table { Sg => "bus" ; Pl => "buses" } } ;

baby_N = { s = table { Sg => "baby" ; Pl => "babies" } } ;

26 of 43

Nouns: paradigms

oper carNoun : Str -> Noun ;

carNoun car = { s = table { Sg => car ; Pl => car + "s" } } ;

oper busNoun : Str -> Noun ;

busNoun bus = { s = table { Sg => bus ; Pl => bus + "es" } } ;

oper babyNoun : Str -> Noun ;

babyNoun baby = { s = table { Sg => baby ; Pl => init baby + "ies" } } ;

27 of 43

Nouns: smart paradigms

oper regNoun : Str -> Noun = \s -> case s of {

_ + ("s" | "ch") => { s = table { Sg => s ; Pl => s + "es" } } ;

_ + "y" => { s = table { Sg => s ; Pl => init s + "ies" } } ;

_ => { s = table { Sg => s ; Pl => s + "s" } }

} ;
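
To try the smart paradigm in the GF shell it has to live in a module. Below is a minimal sketch under a few assumptions: the module name NounSketch and the helper mkNoun are invented here, the extra endings "sh" and "x" and the vowel + "y" branch are additions that the slide does not have, and `init` comes from the RGL's Prelude module, so the RGL must be on the library path.

```gf
-- File: NounSketch.gf (hypothetical module name)
resource NounSketch = open Prelude in {

  param
    Number = Sg | Pl ;

  oper
    Noun : Type = {s : Number => Str} ;

    -- Worst case: both forms are given explicitly.
    mkNoun : Str -> Str -> Noun = \sg,pl ->
      {s = table {Sg => sg ; Pl => pl}} ;

    -- Smart paradigm: guess the plural from the ending of the singular.
    -- Branches are tried in order, so the vowel + "y" case must come
    -- before the general "y" case.
    regNoun : Str -> Noun = \w -> case w of {
      _ + ("s" | "ch" | "sh" | "x")     => mkNoun w (w + "es") ;        -- bus, church
      _ + ("a" | "e" | "o" | "u") + "y" => mkNoun w (w + "s") ;         -- boy, day
      _ + "y"                           => mkNoun w (init w + "ies") ;  -- baby
      _                                 => mkNoun w (w + "s")           -- car
    } ;
}
```

After `i -retain NounSketch.gf`, the command `cc (regNoun "baby").s ! Pl` should give babies and `cc (regNoun "boy").s ! Pl` should give boys. Irregular nouns still need the two-argument mkNoun.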

28 of 43

Smart paradigms: Dutch

regNoun : Str -> Noun = \s -> case s of {

_ + ("a" | "o" | "y" | "u" | "oe" | "é") => mkNoun s (s + "'s") Utr ;

_ + ("oir" | "ion" | "je") => mkNoun s (s + "s") Neutr ;

? + ? + ? + _ +

("el" | "em" | "en" | "er" | "erd" | "aar" | "aard" | "ie") -- unstressed

=> mkNoun s (s + "s") Utr ;

_ + ("i"|"u") => mkNoun s (endCons s + "en") Utr ;

b + v@("aa"|"ee"|"oo"|"uu") + c@? => mkNoun s (b + shortVoc v c + "en") Utr ;

b + ("ei"|"eu"|"oe"|"ou"|"ie"|"ij"|"ui") + ? => mkNoun s (endCons s + "en") Utr ;

_ + "ie" => mkNoun s (s + "ën") Utr ;

b + v@("a"|"e"|"i"|"o"|"u") + c@? => mkNoun s (b + v + c + c + "en") Utr ;

_ => mkNoun s (endCons s + "en") Utr

} ;

29 of 43

Applying smart paradigms

> regNoun "meisje"
s . NF Sg Nom => meisje
s . NF Sg Gen => meisjes
s . NF Pl Nom => meisjes
s . NF Pl Gen => meisjes
g . Neutr

> regNoun "muis"
s . NF Sg Nom => muis
s . NF Sg Gen => muis
s . NF Pl Nom => muizen
s . NF Pl Gen => muizens
g . Utr

(later: how about the nice mkN function?)
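
One common way to get a single mkN is GF's overload construct, which groups several opers with different types under one name. The sketch below uses English rather than Dutch to stay short. The module name ParadigmsSketch and the helper mkNoun are assumptions, and this is not how any particular RGL Paradigms module is written.

```gf
-- File: ParadigmsSketch.gf (hypothetical module name)
resource ParadigmsSketch = open Prelude in {

  param
    Number = Sg | Pl ;

  oper
    Noun : Type = {s : Number => Str} ;

    -- Worst case: both forms given.
    mkNoun : Str -> Str -> Noun = \sg,pl ->
      {s = table {Sg => sg ; Pl => pl}} ;

    -- Smart paradigm, as on the English slide.
    regNoun : Str -> Noun = \w -> case w of {
      _ + ("s" | "ch") => mkNoun w (w + "es") ;
      _ + "y"          => mkNoun w (init w + "ies") ;
      _                => mkNoun w (w + "s")
    } ;

    -- One name for the user; GF picks the variant by argument types.
    mkN = overload {
      mkN : Str -> Noun = regNoun ;         -- mkN "bus"
      mkN : Str -> Str -> Noun = mkNoun     -- mkN "man" "men"
    } ;
}
```

With this, `cc (mkN "bus").s ! Pl` should give buses and `cc (mkN "man" "men").s ! Pl` should give men. The lexicon writer only has to remember mkN and adds more forms when the guess is wrong.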

30 of 43

Verbs (Italian)

  • Variable: Person, tense, mood, polarity
  • Inherent: Which auxiliary it takes for participle
  • Example: MiniresourceIta.gf#L265
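
As a rough illustration of variable versus inherent features in a verb type, here is a sketch that is not taken from the linked file. It covers only the infinitive, the present indicative and one past participle form, and it leaves out tense, mood and polarity as well as participle agreement. The module name VerbSketchIta, the parameter names and the entries dormire_V and andare_V are assumptions.

```gf
-- File: VerbSketchIta.gf (hypothetical module name)
resource VerbSketchIta = {

  param
    Number = Sg | Pl ;
    Person = P1 | P2 | P3 ;
    Aux    = Avere | Essere ;                       -- inherent feature
    VForm  = VInf | VPres Number Person | VPastPart ; -- variable feature

  oper
    Verb : Type = {s : VForm => Str ; aux : Aux} ;

    dormire_V : Verb = {
      s = table {
        VInf        => "dormire" ;
        VPres Sg P1 => "dormo" ;
        VPres Sg P2 => "dormi" ;
        VPres Sg P3 => "dorme" ;
        VPres Pl P1 => "dormiamo" ;
        VPres Pl P2 => "dormite" ;
        VPres Pl P3 => "dormono" ;
        VPastPart   => "dormito"
      } ;
      aux = Avere
    } ;

    andare_V : Verb = {
      s = table {
        VInf        => "andare" ;
        VPres Sg P1 => "vado" ;
        VPres Sg P2 => "vai" ;
        VPres Sg P3 => "va" ;
        VPres Pl P1 => "andiamo" ;
        VPres Pl P2 => "andate" ;
        VPres Pl P3 => "vanno" ;
        VPastPart   => "andato"
      } ;
      aux = Essere
    } ;
}
```

The table in s varies with the context the verb ends up in, while aux is a fixed property of each verb that a later function for compound tenses can inspect.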

31 of 43

Phrases

32 of 43

Phrasal categories

  • “Next step” from noun, adjective and verb:
    • Noun → CN, NP
    • Adjective → AP
    • V, V2 → VP
  • Morphology is there, how do we use it?
    • Agreement: select appropriate things from inflection tables
  • Can we get rid of some distinctions? Do we need to add new ones?
    • e.g. English case in NP

33 of 43

AP

  • Miniresource functions:

cat AP ;

fun UseA : A -> AP ;

  • A to AP: if A has variation (e.g. number & gender) that depends on its eventual head, generally AP keeps that
  • Remember that AP can be used to form both NPs and VPs!
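
A minimal sketch of UseA for a language with gender and number agreement, here Italian. The module names APSketch and APSketchIta are invented, and the two modules belong in separate files named after them. Since an adjective's form depends on a head it has not met yet, the AP keeps the whole inflection table.

```gf
-- File: APSketch.gf (hypothetical module name)
abstract APSketch = {
  flags startcat = AP ;
  cat A ; AP ;
  fun
    UseA : A -> AP ;
    small_A : A ;
}

-- File: APSketchIta.gf (hypothetical module name)
concrete APSketchIta of APSketch = {
  param
    Gender = Masc | Fem ;
    Number = Sg | Pl ;
  lincat
    -- AP has the same type as A: nothing is fixed yet.
    A, AP = {s : Gender => Number => Str} ;
  lin
    UseA a = a ;
    small_A = {s = table {
      Masc => table {Sg => "piccolo" ; Pl => "piccoli"} ;
      Fem  => table {Sg => "piccola" ; Pl => "piccole"}
    }} ;
}
```

After `i APSketchIta.gf`, the command `l -table UseA small_A` should list all four forms, which is what both AdjCN and CompAP need to choose from.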

34 of 43

CN and NP

  • Miniresource functions:

cat CN ;

fun UseN : N -> CN ;

fun AdjCN : AP -> CN -> CN ;

cat NP ;

fun DetCN : Det -> CN -> NP ;

fun ConjNP : Conjunction -> NP -> NP -> NP ;

  • N to CN
    • CN still has variable number
    • AP’s gender/class/... gets fixed
  • CN to NP: number gets fixed, case still variable
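
The following sketch shows where each feature gets fixed, again for Italian. It is simplified on purpose: Italian nouns have no case, so the NP here has a single string, and the determiner ignores the variants uno and un'. The module names NPSketch and NPSketchIta are assumptions, and the two modules go in separate files.

```gf
-- File: NPSketch.gf (hypothetical module name)
abstract NPSketch = {
  flags startcat = NP ;
  cat A ; AP ; N ; CN ; Det ; NP ;
  fun
    UseA  : A -> AP ;
    UseN  : N -> CN ;
    AdjCN : AP -> CN -> CN ;
    DetCN : Det -> CN -> NP ;
    small_A : A ;
    house_N, tree_N : N ;
    a_Det : Det ;
}

-- File: NPSketchIta.gf (hypothetical module name)
concrete NPSketchIta of NPSketch = {
  param
    Gender = Masc | Fem ;
    Number = Sg | Pl ;
  lincat
    A, AP = {s : Gender => Number => Str} ;
    N, CN = {s : Number => Str ; g : Gender} ;  -- number open, gender inherent
    Det   = {s : Gender => Str ; n : Number} ;  -- number inherent
    NP    = {s : Str ; g : Gender ; n : Number} ;
  lin
    UseA a = a ;
    UseN n = n ;
    -- The CN's gender fixes the AP's gender; number stays open.
    AdjCN ap cn = {s = \\n => cn.s ! n ++ ap.s ! cn.g ! n ; g = cn.g} ;
    -- The Det's number fixes the CN's number.
    DetCN det cn = {s = det.s ! cn.g ++ cn.s ! det.n ; g = cn.g ; n = det.n} ;
    small_A = {s = table {
      Masc => table {Sg => "piccolo" ; Pl => "piccoli"} ;
      Fem  => table {Sg => "piccola" ; Pl => "piccole"}
    }} ;
    house_N = {s = table {Sg => "casa" ; Pl => "case"} ; g = Fem} ;
    tree_N  = {s = table {Sg => "albero" ; Pl => "alberi"} ; g = Masc} ;
    a_Det   = {s = table {Masc => "un" ; Fem => "una"} ; n = Sg} ;
}
```

Linearizing `DetCN a_Det (AdjCN (UseA small_A) (UseN house_N))` should give una casa piccola. In a language with case, the s field of NP would be a table over case instead of a plain string.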

35 of 43

VP

  • Miniresource functions:

cat VP ;

fun UseV : V -> VP ;

fun ComplV2 : V2 -> NP -> VP ;

fun CompAP : AP -> VP ;

  • V to VP: keep subject agreement open
  • V2 to VP: V2’s object agreement becomes fixed; NP’s case becomes fixed
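
A sketch of UseV and ComplV2 for English. English verbs do not agree with their objects, so only the case of the object NP gets fixed here. UsePron, the module names VPSketch and VPSketchEng, the Agr parameter and the oper regVerb are assumptions made for the example, and the two modules go in separate files.

```gf
-- File: VPSketch.gf (hypothetical module name)
abstract VPSketch = {
  flags startcat = VP ;
  cat V ; V2 ; Pron ; NP ; VP ;
  fun
    UseV    : V -> VP ;
    ComplV2 : V2 -> NP -> VP ;
    UsePron : Pron -> NP ;      -- assumed name
    walk_V  : V ;
    love_V2 : V2 ;
    i_Pron, they_Pron : Pron ;
}

-- File: VPSketchEng.gf (hypothetical module name)
concrete VPSketchEng of VPSketch = {
  param
    Number = Sg | Pl ;
    Person = P1 | P2 | P3 ;
    Case   = Nom | Acc ;
    Agr    = Ag Number Person ;
  oper
    Verb : Type = {s : Agr => Str} ;
    regVerb : Str -> Verb = \walk ->
      {s = table {Ag Sg P3 => walk + "s" ; _ => walk}} ;
  lincat
    V, V2, VP = Verb ;                      -- subject agreement stays open
    Pron, NP  = {s : Case => Str ; a : Agr} ;
  lin
    UseV v = v ;
    -- The object is always selected in the accusative.
    ComplV2 v2 np = {s = \\agr => v2.s ! agr ++ np.s ! Acc} ;
    UsePron p = p ;
    walk_V  = regVerb "walk" ;
    love_V2 = regVerb "love" ;
    i_Pron    = {s = table {Nom => "I" ; Acc => "me"} ; a = Ag Sg P1} ;
    they_Pron = {s = table {Nom => "they" ; Acc => "them"} ; a = Ag Pl P3} ;
}
```

With `l -table ComplV2 love_V2 (UsePron they_Pron)` the output should contain loves them for the third person singular row and love them elsewhere. The subject has not been seen yet, so the VP must keep the agreement table.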

36 of 43

Syntax

37 of 43

Clauses

  • Miniresource functions:

cat S, Cl, Pol ;

fun PredVP : NP -> VP -> Cl ;

fun UsePresCl : Pol -> Cl -> S ;

fun ConjS : Conjunction -> S -> S -> S ;

  • NP & VP to Cl: VP fixes subject agreement (example)
  • Cl to S: fixes polarity
  • How about word order?
    • Maybe not needed in miniresource, but think where to add the feature!
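
A sketch of PredVP and UsePresCl for English, with polarity as the only open feature of a clause. Everything outside the slide's three function signatures is an assumption: the names UsePron, PPos, PNeg and she_Pron, the module names, and the decision to store an infinitive in the VP so that negation can use do-support. The two modules go in separate files.

```gf
-- File: ClSketch.gf (hypothetical module name)
abstract ClSketch = {
  flags startcat = S ;
  cat S ; Cl ; Pol ; NP ; VP ; V ; Pron ;
  fun
    PredVP    : NP -> VP -> Cl ;
    UsePresCl : Pol -> Cl -> S ;
    UseV      : V -> VP ;
    UsePron   : Pron -> NP ;     -- assumed name
    PPos, PNeg : Pol ;           -- assumed names
    walk_V : V ;
    i_Pron, she_Pron : Pron ;    -- she_Pron is an assumed entry
}

-- File: ClSketchEng.gf (hypothetical module name)
concrete ClSketchEng of ClSketch = {
  param
    Number   = Sg | Pl ;
    Person   = P1 | P2 | P3 ;
    Case     = Nom | Acc ;
    Agr      = Ag Number Person ;
    Polarity = Pos | Neg ;
  oper
    Verb : Type = {fin : Agr => Str ; inf : Str} ;
    regVerb : Str -> Verb = \walk ->
      {fin = table {Ag Sg P3 => walk + "s" ; _ => walk} ; inf = walk} ;
    doAux : Agr => Str = table {Ag Sg P3 => "does" ; _ => "do"} ;
  lincat
    S   = {s : Str} ;
    Cl  = {s : Polarity => Str} ;           -- polarity still open
    Pol = {s : Str ; p : Polarity} ;
    V, VP = Verb ;
    Pron, NP = {s : Case => Str ; a : Agr} ;
  lin
    -- The subject's agreement picks the verb form; its case becomes Nom.
    PredVP np vp = {s = table {
      Pos => np.s ! Nom ++ vp.fin ! np.a ;
      Neg => np.s ! Nom ++ doAux ! np.a ++ "not" ++ vp.inf
    }} ;
    -- The polarity argument picks one row of the clause.
    UsePresCl pol cl = {s = pol.s ++ cl.s ! pol.p} ;
    UseV v = v ;
    UsePron p = p ;
    PPos = {s = [] ; p = Pos} ;
    PNeg = {s = [] ; p = Neg} ;
    walk_V   = regVerb "walk" ;
    i_Pron   = {s = table {Nom => "I" ; Acc => "me"} ; a = Ag Sg P1} ;
    she_Pron = {s = table {Nom => "she" ; Acc => "her"} ; a = Ag Sg P3} ;
}
```

Linearizing `UsePresCl PNeg (PredVP (UsePron she_Pron) (UseV walk_V))` should give she does not walk. A word order feature would be added in the same way as polarity: as one more table dimension in Cl that a later function fixes.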

38 of 43

Demo

39 of 43

Is this useful for more than exercise?

40 of 43

Effort

  • Full resource grammar: months
  • Mini resource grammar: days to weeks
  • Wide-coverage grammar: days if we have RGL

[Figure: diagram with the labels RGL, Wide-coverage and Mini]

41 of 43

Effort

  • Wide-coverage = mostly allowing chunks in yellow grammar
  • If we only have structures in mini, can we still parse something from most sentences?

[Figure: diagram with the labels RGL, Wide-coverage and Mini]

42 of 43

Conclusion

Writing miniresources is…

  • not scary at all! It’s just 6 slides of stuff
  • fun, even for languages you don’t know (well)
  • educational and useful, prepares for full resource
    • you’ll have to address many of the hard problems already in mini

43 of 43

Kiitos!

Paldies!

Thank you!