1 of 20

A Study on Reproducibility and Replicability of Table Structure Recognition Methods

Kehinde Ajayi, Muntabir Choudhury, Sarah Rajtmajer, Jian Wu

1 Old Dominion University, Norfolk, Virginia

2 IST, Pennsylvania State University, University Park, PA, USA

@AjayiKehindep @WebSciDL

International Conference on Document Analysis and Recognition (ICDAR) 2023

A Study on Reproducibility and Replicability of Table Structure Recognition Methods @AjayiKehindep @WebSciDL 2023 Web Science and Digital Libraries Research Expo

2 of 20

Reproducibility and Replicability of AI Papers

  • Reproducibility refers to obtaining consistent computational results using the same data, methods, code, and conditions of analysis.
  • Replicability is obtaining consistent results on a different but similar dataset using the same methods (Goodman et al., 2016).

3 of 20

Status of Reproducibility and Replicability of AI Papers

  • Current studies have focused on:
    • Availability of code and data URLs (Salsabil et al., 2022).
    • Meta-level information such as detailed documentation of experiments (Gunderson et al., 2018).
    • Executability of code (Pimentel et al., 2019).
  • Only a few works have directly reproduced original results (Olorisade et al., 2017).
  • Replication studies of AI papers addressing the same topic are rarely conducted.

4 of 20

Why Do Reproducibility and Replicability of AI Papers Matter?

  • AI is being used in various fields, including healthcare, finance, and defense.
  • The ability to reproduce and replicate AI research findings is crucial to ensuring that the technology is reliable, accurate, and effective.

5 of 20

Use Case: Table Structure Recognition (TSR)

  • TSR is a subdomain of document analysis that focuses on identifying rows and columns of detected tables in a digital document.

6 of 20

Why TSR?

  • Many learning-based TSR methods have been proposed and reportedly achieve high performance, but not all of them were evaluated on the same datasets.
  • Standard benchmarks such as ICDAR 2013 and ICDAR 2019 (Task B2) are available in open competitions.

7 of 20

Research Questions

  • What is the data and code accessibility of TSR papers we sampled?
  • Are the accessible methods executable without contacting the original authors?
  • What is the status of reproducibility based on our criteria?
  • What reasons caused results to be not reproducible?
  • What is the status of replicability with respect to a similar dataset?
  • What is the status of replicability with respect to the new dataset?

8 of 20

Research Workflow

  • Downloaded 25 papers and narrowed them down to the 16 that use deep learning methods.
  • Inspected the papers for a code URL, or searched online.
  • Downloaded the code and data and deployed them locally.
  • GenTSR
  • Code executed without error and the results are within a 10% deviation of the F1-score.
  • Results are better by more than a 10% deviation of the F1-score.
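The 10% criterion in the workflow can be sketched as a simple check. This is a minimal sketch, assuming the deviation is an absolute difference between F1-scores on a 0 to 1 scale; the slides do not say whether the deviation is absolute or relative, and the function and parameter names are hypothetical.

```python
def within_tolerance(reported_f1, reproduced_f1, tolerance=0.10):
    """Return True if the reproduced F1-score is within `tolerance` of the reported one.

    Assumption: "deviation" means the absolute difference between the two
    F1-scores, both expressed on a 0-1 scale.
    """
    return abs(reproduced_f1 - reported_f1) <= tolerance


def deviation_direction(reported_f1, reproduced_f1):
    """Say whether the reproduced score is higher, lower, or equal to the reported one."""
    if reproduced_f1 > reported_f1:
        return "higher"
    if reproduced_f1 < reported_f1:
        return "lower"
    return "equal"
```

For example, a reported F1-score of 0.90 and a reproduced score of 0.85 would pass this check, while a reproduced score of 0.75 would not.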

9 of 20

Data

  • ICDAR 2013: Crawled from European and US government websites.
  • ICDAR 2019 (Task B2): Consists of modern and historical tables.
    • The modern dataset consists of scientific papers, forms, and financial documents.
    • The historical dataset consists of handwritten accounting ledgers and train schedules.
  • GenTSR: Consists of 386 tables from 6 scientific domains.

10 of 20

RQ1: What is the data and code accessibility of TSR papers we sampled?

  • Out of the 16 papers we surveyed, 8 made both their source code and data publicly available.
  • 3 papers made only the code available.
  • 5 papers provided neither code nor data, making it difficult to validate the results of these methods without private communication with the original authors.

11 of 20

RQ2: Are the accessible methods executable without contacting the original authors?

  • Out of the 11 papers with accessible code or data, the code of 5 papers was executable without contacting the original authors.
  • The source code of 5 papers was not executable.
  • The code of one paper was executable after we contacted the original authors.

12 of 20

Metrics for computing Reproducibility and Replicability

  • We used the F1-score together with Intersection over Union (IoU) as the metric.
  • IoU measures how well a detected region overlaps the ground-truth region: the area of their intersection divided by the area of their union.
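To make the metric concrete, the following is a minimal sketch of IoU for two axis-aligned boxes and an F1-score computed at a chosen IoU threshold. It is a simplification for illustration: the box format `(x1, y1, x2, y2)`, the greedy one-to-one matching, the default threshold, and all function names are assumptions, and the benchmarks' official evaluation scripts may match table elements differently.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(predicted, ground_truth, threshold=0.6):
    """F1-score where a prediction counts as correct if it overlaps an
    unmatched ground-truth box with IoU >= threshold (greedy matching)."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predicted:
        # Pick the remaining ground-truth box that overlaps this prediction most.
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= threshold:
            true_positives += 1
            unmatched.remove(best)  # each ground-truth box is matched at most once
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Raising the threshold makes matching stricter, so the same predictions can score a lower F1 at a higher IoU.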

13 of 20

RQ3: What is the status of reproducibility based on our criteria?

  • 4 out of the 6 executable TSR methods were labeled reproducible, 1 paper was labeled partially-reproducible, and 1 paper was labeled not-reproducible.

14 of 20

RQ4: What reasons caused results to be not reproducible?

  • Data and code availability
  • Portability
  • Documentation
  • Dependency and compatibility issues
  • Data and code durability

15 of 20

RQ5: What is the status of replicability with respect to a similar dataset?

  • Out of the 6 methods that were either executable or reproducible, only 2 were replicable, and only under certain IoU thresholds.

16 of 20

RQ6: What is the status of replicability with respect to the new dataset?

  • None of the 4 methods that allow inference on custom data was replicable with respect to the GenTSR dataset.

17 of 20

Discussion and Future Work

  • Replicability is more challenging than reproducibility and is data dependent.
  • Most existing benchmark datasets of TSR models are not diverse enough.
  • The GenTSR dataset contains scientific tables from 6 domains.

18 of 20

Discussion and Future Work

  • Sample size is relatively small.
  • Reproducibility is not necessarily associated with ranking.

19 of 20

Summary

  • This work presents a study of reproducibility and replicability considering 16 papers on TSR published between 2017 and 2022.
  • We first attempted to directly reproduce the results reported in the original papers.
  • We then tested executable methods on an alternate benchmark dataset similar to the one used in the original paper as well as a new dataset from 6 scientific domains.
  • 12 out of 16 papers we examined were not fully reproducible.
  • Only 2 papers were identified as conditionally replicable on a similar dataset.
  • None of the papers was identified as replicable with respect to the new dataset.
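The figure of 12 papers follows from the counts reported under RQ1 to RQ3. The short tally below makes the arithmetic explicit; the numbers come from the slides, while the variable names are ours.

```python
# Counts as reported under RQ1-RQ3; variable names are illustrative.
total_papers = 16

# RQ1: accessibility
code_and_data, code_only, neither = 8, 3, 5
accessible = code_and_data + code_only            # 11 papers

# RQ2: executability among the accessible papers
ran_without_contact, ran_after_contact, not_executable = 5, 1, 5
executable = ran_without_contact + ran_after_contact   # 6 methods

# RQ3: reproducibility labels among the executable methods
reproducible, partially_reproducible, not_reproducible = 4, 1, 1

# Every paper that did not end up labeled "reproducible"
not_fully_reproducible = total_papers - reproducible   # 12 papers

# Sanity checks: each stage partitions the previous one
assert code_and_data + code_only + neither == total_papers
assert executable + not_executable == accessible
assert reproducible + partially_reproducible + not_reproducible == executable
```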

20 of 20

Acknowledgements

  • This work was partially supported by the Institute of Museum and Library Services, and by the Defense Advanced Research Projects Agency (DARPA) under cooperative agreement.
