Defectors: A Large, Diverse Python Dataset for Defect Prediction
Mohammad Masudur Rahman, Dalhousie University
masud.rahman@dal.ca
WHs
Why
What
How do we do it?
Dataset Construction
Project Selection
Bug-fix Commit Collection
Bug Inducing Commit Collection
Bug Inducing Commits Filtration
Sampling Defect-free Commits
Formalizing The Dataset
Takeaway
Thank You!
Questions?
Dataset Construction
Project Selection
Bug-fix Commit Collection
Dataset Construction Contd.
Bug Inducing Commit Collection
Bug Inducing Commits Filtration
[1] J. Śliwerski, T. Zimmermann, and A. Zeller, “When do changes induce fixes?” ACM SIGSOFT Software Engineering Notes, vol. 30, no. 4, pp. 1–5, 2005.
[2] D. Spadini, M. Aniche, and A. Bacchelli, “PyDriller: Python framework for mining software repositories,” in Proceedings of the 2018 26th ESEC/FSE, 2018, pp. 908–911.
Dataset Construction Contd.
Collecting and Sampling Defect-free Commits
Formalizing The Dataset