Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents
Sayar Ghosh Roy 1, Anshul Padhi 1, Risubh Jain 1, Manish Gupta 1, 2, Vasudeva Varma 1
International Institute of Information Technology, Hyderabad 1
Microsoft, India 2
Proceedings of the 33rd ACM Conference on Hypertext and Social Media (HT '22)
June 28 – July 1, 2022, Barcelona, Spain
Introduction
Related Work on Document-level Popularity Prediction
Two types of popularity prediction problems based on the choice of popularity surrogate:
For informative documents (including online news), a preferred surrogate of popularity has been pageview hits [8, 10, 11]
Intuitively, pageviews capture the generic browsing trends of the population, not limited to social media actions
Sentence-Specific Information Popularity
Assigning Sentence-Specific Popularity Labels
News document D with sentences [s1, s2, ..., sN]
Task Description
Inspired by research on document popularity prediction [1, 12]
Some document-popularity forecasting approaches rely on post-publication signals like pageview hits in the first half-hour after publication [1, 9], but proactive forecasting must work from the document text alone, before publication
Task Formulation
Query-insensitive relative normalized scoring of sentences
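The paper's exact scoring pipeline is not reproduced here; as an illustrative sketch (function name hypothetical), "relative normalized scoring" within a document could amount to min-max scaling raw per-sentence popularity scores into [0, 1]:

```python
def minmax_normalize(scores):
    """Min-max scale raw per-sentence scores to [0, 1] within one document."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # degenerate case: all sentences equally popular
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

# toy example: three sentences with raw scores 2, 4, 6
labels = minmax_normalize([2, 4, 6])  # [0.0, 0.5, 1.0]
```

Normalizing within each document makes labels comparable across documents with very different absolute traffic levels.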
Figure 1: Task outline. Looking solely at document text as input, forecast prospective sentence-specific popularity scores
InfoPop Dataset
Figure 2: InfoPop: news sources (on x-axis) with their corresponding #documents (on y-axis)
Figure 3: InfoPop: Distribution of #sentences per document
Proposed Approach
Salience Prediction versus Text Summarization
We only capture salience & expect very similar sentences to have similar labels
Auxiliary Subtasks
Table 1: Selected sentences (in order) from a document in Sentence Salience Prediction dataset with 3 types of salience labels based on ROUGE-1, ROUGE-2, and ROUGE-L
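The exact label computation follows the paper; as a rough sketch, ROUGE-style F1 overlap between a candidate sentence and a reference summary (the basis of the three salience label types) can be computed with plain n-gram counting and a longest-common-subsequence table:

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_f1(cand, ref, n):
    """ROUGE-N F1: n-gram overlap between candidate and reference tokens."""
    c, r = ngrams(cand, n), ngrams(ref, n)
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    prec, rec = overlap / sum(c.values()), overlap / sum(r.values())
    return 2 * prec * rec / (prec + rec)

def lcs_len(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(cand, ref):
    """ROUGE-L F1: LCS-based overlap between candidate and reference tokens."""
    l = lcs_len(cand, ref)
    if l == 0:
        return 0.0
    prec, rec = l / len(cand), l / len(ref)
    return 2 * prec * rec / (prec + rec)
```

Each sentence scored against the reference summary yields three labels per sentence (ROUGE-1, ROUGE-2, ROUGE-L), matching the S1, S2, and SL subtask variants.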
Neural Architectures
Figure 4: BaseReg: RNN for Sentence Sequence Regression
Figure 5: BERTReg: BERT for Sentence Sequence Regression
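A minimal sketch of sentence sequence regression, the pattern shared by BaseReg and BERTReg: encode each sentence, run an RNN over the sentence sequence, and regress one score per sentence. All weights below are random stand-ins (the real models use pretrained encoders and trained parameters); dimensions are toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 8, 16  # toy embedding / hidden sizes (hypothetical)

# random weights stand in for trained parameters
W_xh = rng.normal(size=(d_h, d_in)) * 0.1
W_hh = rng.normal(size=(d_h, d_h)) * 0.1
w_out = rng.normal(size=d_h) * 0.1

def score_sentences(sent_embs):
    """Elman RNN over sentence embeddings; one regression score per sentence."""
    h = np.zeros(d_h)
    scores = []
    for x in sent_embs:
        h = np.tanh(W_xh @ x + W_hh @ h)          # update recurrent state
        scores.append(1 / (1 + np.exp(-w_out @ h)))  # squash score to (0, 1)
    return scores

doc = rng.normal(size=(5, d_in))  # 5 sentences, toy embeddings
scores = score_sentences(doc)
```

The recurrence lets each sentence's score depend on its position and its context in the document, not just its own content.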
Evaluation Metrics
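Since the slides report nDCG for ranked sentence scores (cf. Table 4), a sketch of that metric: rank sentences by predicted score, accumulate true-label gains with logarithmic position discounting, and normalize by the ideal ranking.

```python
import math

def dcg(gains):
    """Discounted cumulative gain: later positions count less."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(true_scores, pred_scores, k=None):
    """nDCG: DCG of the predicted ranking over DCG of the ideal ranking.
    Assumes at least one true score is positive."""
    k = k or len(true_scores)
    order = sorted(range(len(pred_scores)), key=lambda i: -pred_scores[i])
    ranked = [true_scores[i] for i in order][:k]
    ideal = sorted(true_scores, reverse=True)[:k]
    return dcg(ranked) / dcg(ideal)
```

A perfect ranking scores 1.0; any misordering of high-gain sentences lowers the score.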
Sentence-Specific Popularity Forecasting Results
Table 2: Sentence Popularity Forecasting Results
Performance Enhancement due to Transfer Learning
Attribute the performance enhancement to 2 factors
Performance of Approaches on Auxiliary Subtasks
Table 3: Performance of various approaches on auxiliary transfer learning subtasks (S1, S2, SL)
Task Comparison
Popularity and Salience
Salient sentences capture summary-inclusion-worthy ideas that are central to the core semantics of an article [18]
Consider a sentence from a particular article (with ID 34499) within InfoPop: “Weinsheimer has spent 27 years at DOJ, where he tried homicide and public corruption cases.”
Table 4: Selected sentences from a document in InfoPop with their true and forecasted popularities and predicted salience. Popularity forecasts are from our best-performing model on nDCG (BERTReg with TL = SL). Salience predictions are based on BERTReg trained on S1.
[TPL: True Popularity Label, FPL: Forecasted Popularity Label, PSL: Predicted Salience Label, TPR: True Popularity Rank, FPR: Forecasted Popularity Rank, PSR: Predicted Salience Rank]
Empirical Cross-task Evaluation
Table 5: Cross-task evaluation: performance of BERTReg trained for popularity forecasting (PF) evaluated on salience prediction, and vice versa
Conclusions
Thank You
References
[1] Yaser Keneshloo, Shuguang Wang, E. Han, and Naren Ramakrishnan. 2016. Predicting the Popularity of News Articles. In SDM.
[2] Sotiris Lamprinidis, Daniel Hardt, and Dirk Hovy. 2018. Predicting News Headline Popularity with Syntactic and Semantic Knowledge Using Multi-Task Learning. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, 659–664. https://doi.org/10.18653/v1/D18-1068
[3] Nuno Moniz and Luís Torgo. 2018. Multi-Source Social Feedback of Online News Feeds. CoRR abs/1801.07055 (2018).
[4] Georgios Rizos, Symeon Papadopoulos, and Yiannis Kompatsiaris. 2016. Predicting News Popularity by Mining Online Discussions. In Proceedings of the 25th International Conference Companion on World Wide Web (Montréal, Québec, Canada) (WWW ’16 Companion). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 737–742. https://doi.org/10.1145/2872518.2890096
[5] Alexandru Tatar, Panayotis Antoniadis, Marcelo Dias de Amorim, and Serge Fdida. 2012. Ranking News Articles Based on Popularity Prediction. In 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 106–110. https://doi.org/10.1109/ASONAM.2012.28
[6] Md. Taufeeq Uddin, Muhammed Jamshed Alam Patwary, Tanveer Ahsan, and Mohammed Shamsul Alam. 2016. Predicting the popularity of online news from content metadata. In 2016 International Conference on Innovations in Science, Engineering and Technology (ICISET). 1–5. https://doi.org/10.1109/ICISET.2016.7856498
[7] Anton Voronov, Yao Shen, and Pritom Kumar Mondal. 2019. Forecasting Popularity of News Article by Title Analyzing with BN-LSTM Network. In Proceedings of the 2019 International Conference on Data Mining and Machine Learning (Hong Kong, Hong Kong) (ICDMML 2019). Association for Computing Machinery, New York, NY, USA, 19–27. https://doi.org/10.1145/3335656.3335679
[8] Anthony Chen, Pallavi Gudipati, Shayne Longpre, Xiao Ling, and Sameer Singh. 2021. Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 4472–4485. https://doi.org/10.18653/v1/2021.acl-long.345
[9] Mohamed Ahmed, Stella Spagna, Felipe Huici, and Saverio Niccolini. 2013. A Peek into the Future: Predicting the Evolution of Popularity in User Generated Content. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining (Rome, Italy) (WSDM ’13). Association for Computing Machinery, New York, NY, USA, 607–616. https://doi.org/10.1145/2433396.2433473
[10] Alexander Pugachev, Anton Voronov, and Ilya Makarov. 2020. Prediction of News Popularity via Keywords Extraction and Trends Tracking. Recent Trends in Analysis of Images, Social Networks and Texts 1357 (2020), 37 – 51.
[11] Yun-Zhu Song, Hong-Han Shuai, Sung-Lin Yeh, Yi-Lun Wu, Lun-Wei Ku, and Wen-Chih Peng. 2020. Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation. Proceedings of the AAAI Conference on Artificial Intelligence 34, 05 (Apr. 2020), 8910–8917. https://doi.org/10.1609/aaai.v34i05.6421
[12] Shivashankar Subramanian, Timothy Baldwin, and Trevor Cohn. 2018. Content-based Popularity Prediction of Online Petitions Using a Deep Regression Model. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Melbourne, Australia, 182–188. https://doi.org/10.18653/v1/P18-2030
[13] Ramesh Nallapati, Feifei Zhai, and Bowen Zhou. 2016. SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents. arXiv preprint arXiv:1611.04230 (2016).
[14] Ruipeng Jia, Yanan Cao, Haichao Shi, Fang Fang, Yanbing Liu, and Jianlong Tan. 2020. DistilSum: Distilling the Knowledge for Extractive Summarization. In Proceedings of the 29th ACM International Conference on Information and Knowledge Management (Virtual Event, Ireland) (CIKM ’20). Association for Computing Machinery, New York, NY, USA, 2069–2072. https://doi.org/10.1145/3340531.3412078
[15] Karl Moritz Hermann, Tomás Kociský, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. 2015. Teaching Machines to Read and Comprehend. In NIPS. 1693–1701. http://papers.nips.cc/paper/5945-teaching-machines-to-read-and-comprehend
[16] Jason Phang, Thibault Févry, and Samuel R. Bowman. 2018. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks. ArXiv abs/1811.01088 (2018).
[17] Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey, and Noah A. Smith. 2020. Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks. arXiv:2004.10964 [cs.CL]
[18] Ming Zhong, Pengfei Liu, Yiran Chen, Danqing Wang, Xipeng Qiu, and Xuanjing Huang. 2020. Extractive Summarization as Text Matching. In Proceedings of the 58th Annual Meeting of the ACL. ACL, Online, 6197–6208. https://doi.org/10.18653/v1/2020.acl-main.552