Text Obsoleteness Detection using Large Language Models
Rishav Ranaut1, Sriparna Saha1, Adam Jatowt2, Manish Gupta3
1Indian Institute of Technology Patna, India, 2University of Innsbruck, Austria, 3Microsoft
- Information in large-scale repositories like Wikipedia quickly becomes outdated.
- Manual tracking of content obsolescence is infeasible at scale.
- LLMs have potential to reason over time-sensitive changes and predict content expiry.
- Timely content updates are critical for applications like search, question answering, and regulatory compliance.
Develop a Multitask Learning (MTL) framework using LLMs for:
- Semantic Update Detection (SUD) – identifying factual changes between text versions.
- Semantic Update Necessity Prediction (SUNP) – predicting if content will need updates in the future.
- Curate a novel dataset SemUpdates from frequently revised Wikipedia articles.
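The dataset-curation step above rests on extracting factual changes between successive Wikipedia revisions. A minimal sketch of such a revision-diffing step is shown below, using Python's standard `difflib`; the sentence splitting and pairing logic are illustrative assumptions, not the paper's actual SemUpdates pipeline.

```python
import difflib

def changed_sentences(old_rev: str, new_rev: str) -> list[tuple[str, str]]:
    """Pair up sentences that were replaced between two article revisions.

    Illustrative sketch only: naive period-based sentence splitting; the
    real pipeline would use a proper sentence segmenter and revision API.
    """
    old_sents = [s.strip() for s in old_rev.split(".") if s.strip()]
    new_sents = [s.strip() for s in new_rev.split(".") if s.strip()]
    matcher = difflib.SequenceMatcher(a=old_sents, b=new_sents)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":  # candidate factual (semantic) update
            pairs.append((" ".join(old_sents[i1:i2]),
                          " ".join(new_sents[j1:j2])))
    return pairs
```

Replaced sentence pairs of this kind can serve as positive SUD examples, while unchanged spans supply negatives.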
Examples of Data Generated for SemUpdates Dataset
Summary of Dataset Statistics
Performance comparison for the SUD and SUNP tasks
Performance of LLMs under multitask fine-tuning
Performance of LLMs in MTL with separate task heads
- Fine-tuning significantly boosts LLM performance (avg. +18.5% SUD, +23.5% SUNP) over zero/few-shot baselines.
- Qwen2 achieves highest accuracy (82% SUD, 77% SUNP) in multitask learning.
- Mistral yields best results in task-specific fine-tuning setups.
- Task-specific heads in multitask setup offer better specialization without sacrificing efficiency.
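The shared-backbone / separate-heads idea in the last finding can be sketched in miniature as follows. This is a toy illustration with made-up features and weights, not the paper's implementation (which fine-tunes full LLMs such as Qwen2 and Mistral): the point is that one shared representation is computed once and each task applies its own small head.

```python
def shared_encoder(text: str) -> list[float]:
    """Stand-in for the shared LLM backbone: a tiny bag-of-chars feature."""
    feats = [0.0] * 4
    for ch in text.lower():
        feats[ord(ch) % 4] += 1.0
    return feats

def make_head(weights: list[float]):
    """Build a task-specific head: one linear scorer over shared features."""
    def head(feats: list[float]) -> int:
        score = sum(w * f for w, f in zip(weights, feats))
        return int(score > 0)  # 1 = change detected / update needed
    return head

# Illustrative per-task weights (assumed, not learned values).
sud_head = make_head([0.3, -0.1, 0.2, 0.1])    # Semantic Update Detection
sunp_head = make_head([-0.2, 0.4, 0.1, -0.3])  # Update Necessity Prediction

def predict(text: str) -> dict:
    feats = shared_encoder(text)  # backbone runs once, shared by both heads
    return {"SUD": sud_head(feats), "SUNP": sunp_head(feats)}
```

Because the expensive encoder pass is shared, adding a second head costs almost nothing at inference time, which is the efficiency argument behind the multitask setup.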
- Results show that LLMs can effectively perform Semantic Update Detection (SUD) and Semantic Update Necessity Prediction (SUNP).
- Introduced SemUpdates, a curated dataset from frequently edited Wikipedia content.
- Designed an automated pipeline to extract and process real-world factual changes.
- Fine-tuning boosts performance, with Qwen2 best in multitask setups and Mistral leading in single-task tuning.