1 of 22

Incorporating External Knowledge to Enhance Tabular Reasoning

https://knowledge-infotabs.github.io/

J. Neeraja¹*, Vivek Gupta²*, Vivek Srikumar²

¹IIT Guwahati; ²University of Utah

2 of 22

TABULAR INFERENCE PROBLEM

(Figure: an example table about the New York Stock Exchange from the InfoTabS dataset (Gupta et al., 2020), with three hypotheses: H1 is entailed, H2 is contradictory, H3 is neutral.)


  • The tabular natural language inference problem is similar to standard NLI

  • But here, the premise is a semi-structured table rather than running text

  • Task: decide whether a given hypothesis is true (entailment), false (contradiction), or undetermined (neutral) given the premise table

Check out InfoTabS (Gupta et al., 2020): https://infotabs.github.io

3 of 22

MOTIVATION

Recent work for tabular reasoning focuses on building sophisticated neural models.

Questions

  • How do models designed for raw text adapt to tabular data?
  • How should tabular data be represented, and how can knowledge be incorporated into the model?
  • Can better preprocessing of tabular information enhance the model?


4 of 22

CHALLENGES

  1. Poor Representation of Tabular Information
  2. Missing Implicit Lexical Knowledge
  3. Presence of Distracting Information
  4. Missing Domain Knowledge about Keys


Can we fix these problems by changing how tabular information is provided to a standard RoBERTa model?

5 of 22

CHALLENGE: POOR TABLE REPRESENTATION


  • In Gupta et al. (2020), every row (key k, value v) of a table titled t is converted with a universal template: “The k of t are v.”

The Founded of New York Stock Exchange are May 17, 1792; 226 years ago.

  • Most of the resulting sentences are ungrammatical

6 of 22

SOLUTION: BETTER PARAGRAPH REPRESENTATION


  • Entity-specific templates: pick a template based on the value’s entity type, e.g. DATE, CARDINAL, or BOOL

New York Stock Exchange was founded on May 17, 1792; 226 years ago.

  • Add the table’s category information

New York Stock Exchange is an organization.

This yields more grammatical and meaningful sentences (a template sketch follows below)
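A minimal sketch of how such entity-specific templates could be implemented; the type names and template strings are illustrative assumptions rather than the paper's exact templates, and in practice the value's entity type would come from a named-entity tagger.

```python
# Sketch of entity-specific sentence templates for table rows.
# Type names and template strings are illustrative assumptions.
TEMPLATES = {
    "DATE": "{title} was {key} on {value}.",
    "CARDINAL": "{title} has {value} {key}.",
    "DEFAULT": "The {key} of {title} is {value}.",  # fallback universal template
}

def row_to_sentence(title: str, key: str, value: str, entity_type: str) -> str:
    """Turn one (key, value) row of a table about `title` into a sentence."""
    template = TEMPLATES.get(entity_type, TEMPLATES["DEFAULT"])
    return template.format(title=title, key=key.lower(), value=value)

def category_sentence(title: str, category: str) -> str:
    """State the table's category as an extra sentence (article choice simplified)."""
    return f"{title} is an {category.lower()}."

print(row_to_sentence("New York Stock Exchange", "Founded",
                      "May 17, 1792; 226 years ago", "DATE"))
# -> New York Stock Exchange was founded on May 17, 1792; 226 years ago.
print(category_sentence("New York Stock Exchange", "Organization"))
# -> New York Stock Exchange is an organization.
```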

7 of 22

CHALLENGE: MISSING IMPLICIT LEXICAL KNOWLEDGE


  • Limited training data
  • Interpreting words like ‘fewer’ and ‘over’, and negations like ‘never’ or ‘not’, is difficult

e.g.

H2′: Fewer than 2,500 stocks are listed in the NYSE

Without implicit lexical knowledge, the model labels H2′ as contradictory (incorrect)

8 of 22

SOLUTION: IMPLICIT KNOWLEDGE ADDITION


Can pre-training on a large dataset help?

  • Pre-train with MNLI data
  • Then, fine-tune on InfoTabS (a training sketch follows at the end of this slide)

This exposes the model to diverse lexical constructions

The representation is better tuned for the NLI task

e.g.

H2′: Fewer than 2,500 stocks are listed in the NYSE

With MNLI pre-training, the model now labels H2′ as entailed (correct)
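A minimal sketch of this two-stage setup with the Hugging Face transformers library; the checkpoint name, hyperparameters, and the `infotabs_train` dataset object are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: start from a RoBERTa checkpoint already fine-tuned on MNLI (implicit
# knowledge), then continue fine-tuning on InfoTabS premise/hypothesis pairs.
# Checkpoint name, hyperparameters, and `infotabs_train` are illustrative.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "roberta-large-mnli"  # RoBERTa already trained on MNLI
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)  # 3 labels

def encode(example):
    # `premise` is the paragraph built from the table; `hypothesis` is H1/H2/H3.
    return tokenizer(example["premise"], example["hypothesis"],
                     truncation=True, padding="max_length", max_length=512)

# `infotabs_train` is assumed to be a datasets.Dataset with 'premise',
# 'hypothesis', and 'label' columns, with label ids matching the checkpoint.
train_data = infotabs_train.map(encode)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="roberta-mnli-infotabs",
                           num_train_epochs=3),
    train_dataset=train_data,
)
trainer.train()
```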

9 of 22

CHALLENGE: PRESENCE OF DISTRACTING INFORMATION


  • Given a hypothesis, only a few rows are relevant
    • For H1 and H2, the row with key No. of listings is sufficient.
    • Similarly, for H3, the row with key Volume is sufficient.
  • BERT-style encoders also have a token limit
    • longer tables get cropped

10 of 22

SOLUTION: DISTRACTING ROW REMOVAL


Select only the rows relevant to the hypothesis

Use an alignment-based retrieval algorithm with fastText vectors (Yadav et al., 2019, 2020); a pruning sketch follows below

E.g., for H1 and H2, prune the table to the single row with key No. of listings
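A minimal sketch of such hypothesis-conditioned row pruning; the scoring below (best word alignment per hypothesis token, averaged) is a simplified stand-in for the retrieval algorithm of Yadav et al. (2019, 2020), and the fastText model path and `nyse_table` object are assumptions.

```python
# Sketch of distracting-row removal: keep the top-k rows whose words align
# best with the hypothesis, scored with fastText word vectors. A simplified
# stand-in for the alignment-based retrieval of Yadav et al. (2019, 2020).
import numpy as np
import fasttext

ft = fasttext.load_model("cc.en.300.bin")  # assumed local fastText binary

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def row_score(row_text: str, hypothesis: str) -> float:
    """Average, over hypothesis words, of the best-aligned word in the row."""
    row_vecs = [ft.get_word_vector(w) for w in row_text.lower().split()]
    best_per_word = [max(cosine(ft.get_word_vector(w), rv) for rv in row_vecs)
                     for w in hypothesis.lower().split()]
    return sum(best_per_word) / len(best_per_word)

def prune_rows(table: dict, hypothesis: str, k: int = 4) -> dict:
    """Keep the k rows (key -> value) most relevant to the hypothesis."""
    ranked = sorted(table, reverse=True,
                    key=lambda key: row_score(f"{key} {table[key]}", hypothesis))
    return {key: table[key] for key in ranked[:k]}

# e.g. prune_rows(nyse_table, "Fewer than 2,500 stocks are listed in the NYSE", k=1)
# should keep only the "No. of listings" row (nyse_table is hypothetical).
```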

11 of 22

CHALLENGE: MISSING DOMAIN KNOWLEDGE ABOUT KEYS


For H3, we need to interpret the key Volume in the financial context:

In capital markets, volume is the total number of a security that was traded during a given period of time.

rather than

In thermodynamics, the volume of a system is an extensive parameter describing its thermodynamic state.

12 of 22

SOLUTION: EXPLICIT KNOWLEDGE ADDITION


Add explicit information to enrich the keys

This improves the model’s ability to disambiguate the meaning of keys

13 of 22

SOLUTION: EXPLICIT KNOWLEDGE ADDITION


Approach

  • Embed each WordNet sense of the key by running BERT over the sense’s example sentences
  • Embed the key in its premise context using BERT
  • Find the best-matching sense and append its definition (a sketch follows below)

For H3, append at the end of the table:

Volume : total number of a security that was traded during a given period of time.
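A minimal sketch of this sense-matching step using BERT (via transformers) and WordNet (via NLTK); mean pooling over whole sentences, rather than over the key token alone, and the bert-base-uncased checkpoint are simplifying assumptions.

```python
# Sketch of key-sense disambiguation with BERT and WordNet: embed each WordNet
# sense of a key through its example sentences, embed the key's premise
# context, pick the closest sense, and append its definition to the table.
# Requires nltk.download('wordnet'). Mean pooling is a simplifying assumption.
import torch
from nltk.corpus import wordnet as wn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence: str) -> torch.Tensor:
    """Mean-pooled BERT embedding of a sentence."""
    inputs = tok(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = enc(**inputs).last_hidden_state  # (1, seq_len, hidden_size)
    return hidden.mean(dim=1).squeeze(0)

def best_definition(key: str, premise_context: str) -> str:
    """WordNet definition of the key sense closest to its premise context."""
    context_vec = embed(premise_context)
    best, best_sim = None, -1.0
    for synset in wn.synsets(key):
        texts = synset.examples() or [synset.definition()]  # fall back to gloss
        sense_vec = torch.stack([embed(t) for t in texts]).mean(dim=0)
        sim = torch.cosine_similarity(context_vec, sense_vec, dim=0).item()
        if sim > best_sim:
            best, best_sim = synset, sim
    return f"{key} : {best.definition()}" if best else ""

# e.g. best_definition("volume", "<sentence built from the Volume row of the table>")
```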

14 of 22

PROPOSED SOLUTION

(Pipeline figure: the original table is converted with better paragraph representation, distracting rows are removed, and explicit key knowledge is added; the resulting premise feeds a RoBERTa model that carries implicit knowledge from MNLI pre-training.)
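Putting the pieces together, a hypothetical end-to-end premise builder could compose the earlier sketches as follows; all helper names (row_to_sentence, category_sentence, prune_rows, best_definition) are assumptions carried over from those sketches, not the authors' released code.

```python
# Hypothetical composition of the preprocessing sketches above into a single
# premise builder; helper names are assumptions from the earlier sketches.
def build_premise(table: dict, title: str, category: str,
                  entity_types: dict, hypothesis: str) -> str:
    rows = prune_rows(table, hypothesis)              # distracting row removal
    sentences = [category_sentence(title, category)]  # category information
    for key, value in rows.items():
        # better paragraph representation
        sent = row_to_sentence(title, key, value, entity_types.get(key, "DEFAULT"))
        sentences.append(sent)
        sentences.append(best_definition(key, sent))  # explicit knowledge
    # The resulting paragraph is the premise given to the MNLI-pretrained RoBERTa.
    return " ".join(s for s in sentences if s)
```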

15 of 22

RESULTS AND ANALYSIS

InfoTabS test splits:

  • α1 contains tables from the same domains as the train and dev sets

  • α2 has examples from the same domains, but with the entail/contradict labels flipped by a minimal change to the hypothesis (e.g. ‘over’ to ‘under’), i.e. adversarial

  • α3 contains zero-shot, cross-domain tables (its domains are excluded from the train set)


Check out InfoTabS: https://infotabs.github.io

16 of 22

RESULTS AND ANALYSIS


17 of 22

RESULTS AND ANALYSIS

(Figure: results compared with human performance.)

18 of 22

RESULTS AND ANALYSIS

Observations

Overall, pre-processing improves performance

Ablation: all the changes are needed; knowledge addition is the most important


19 of 22

RESULTS AND ANALYSIS

Observations

  • Implicit knowledge addition improves α1, α2, and α3
  • This reflects the model learning implicit lexical knowledge


20 of 22

RESULTS AND ANALYSIS

Observations

  • Distracting row removal improves α3
  • α3 tables are longer, so without pruning BERT’s token limit crops relevant rows


21 of 22

RESULTS AND ANALYSIS

Observations

  • Adding explicit knowledge helps α3 improve
  • α3 is zero-shot, so information about the keys is needed


22 of 22

CONCLUSION

  • Effective preprocessing techniques improve tabular reasoning; our case study on tabular inference shows performance improvements.
  • The proposed preprocessing leads to significant improvements, especially on the adversarial sets of the tabular inference dataset (InfoTabS).
  • The solutions are applicable to question answering and generation problems that involve both tabular and textual inputs, especially under adversarial evaluation.
  • We recommend that these modifications be standardized across other table reasoning tasks.

Check out Knowledge_InfoTabs: https://knowledge-infotabs.github.io/
