Towards Universal Unfolding of Detector Effects in High-Energy Physics using Denoising Diffusion Probabilistic Models

Interdisciplinary Research Achievement

This work introduces a novel approach that uses conditional Denoising Diffusion Probabilistic Models (cDDPM) to unfold detector effects in HEP data. We demonstrate that a single cDDPM, trained on diverse particle data and incorporating statistical moments of various distributions, can serve as a “generalized” unfolder: it performs multidimensional object-wise unfolding for multiple physics processes without explicit assumptions about the underlying distribution, thereby minimizing bias. Accepted to NeurIPS Workshop on Machine Learning for the Physical Sciences.

Impact on Artificial Intelligence

The proposed approach demonstrates, through an application to unfolding problems in HEP, that enhancing inductive bias in the training phase of an algorithm improves its generalization potential. The cDDPM algorithm is a non-iterative and flexible posterior sampling approach that exhibits a strong inductive bias. Using moments as conditionals yields a more flexible unfolder that is not strictly tied to a set of selected prior distributions and can generalize to unseen processes without explicitly assuming the underlying physics distributions.
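To make the idea of moments as conditionals concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names `moment_vector` and `build_conditioned_inputs`, the choice of the first four moments, and the concatenation of the moments onto each event are all assumptions made for illustration. The summary above does not specify which moments are used or how they enter the network.

```python
import numpy as np


def moment_vector(samples: np.ndarray, n_moments: int = 4) -> np.ndarray:
    """Summarize a sample by its first moments, feature by feature.

    Returns the mean, the variance, and the standardized central moments
    of order 3..n_moments (skewness, kurtosis, ...), concatenated into one
    flat vector. Assumes every feature has non-zero variance.
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)
    centered = samples - mean
    var = (centered ** 2).mean(axis=0)
    std = np.sqrt(var)

    moments = [mean, var]
    for order in range(3, n_moments + 1):
        moments.append((centered ** order).mean(axis=0) / std ** order)
    return np.concatenate(moments)


def build_conditioned_inputs(detector_events: np.ndarray) -> np.ndarray:
    """Attach the dataset-level moment vector to every detector-level event.

    Hypothetical conditioning scheme: each row becomes
    [event features | moments of the whole detector-level sample].
    """
    detector_events = np.asarray(detector_events, dtype=float)
    cond = moment_vector(detector_events)
    tiled = np.tile(cond, (detector_events.shape[0], 1))
    return np.concatenate([detector_events, tiled], axis=1)
```

Under these assumptions, a sample of 1000 events with 3 features each would produce inputs with 3 + 4 × 3 = 15 columns. Because the conditioning vector describes the distribution as a whole rather than a fixed prior, the same network can in principle be handed samples from a process it was not trained on.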

Impact on Fundamental Interactions

Correcting for detector effects in experimental data, particularly through unfolding, is critical for enabling precision measurements in HEP. However, traditional methods face challenges in scalability, flexibility, and dependence on simulations, and they eliminate essential correlations between unfolded distributions, which limits how the data can be interpreted. With cDDPM, data from unseen physics processes can be unfolded without significant bias, and the object-based unfolding preserves correlations between unfolded data, allowing for a comprehensive interpretation of the results and thus a better understanding of the fundamental interactions of nature.

Camila Pazos (Tufts/IAIFI), Shuchin Aeron (Tufts/IAIFI), Pierre-Hugues Beauchemin (Tufts/IAIFI), Vincent Croft, Zhengyan Huan (Tufts), Martin Klassen, Taritree Wongjirad (Tufts/IAIFI)

The NSF Institute for Artificial Intelligence and Fundamental Interactions (IAIFI) is supported by the National Science Foundation under Cooperative Agreement PHY-2019786.

http://iaifi.org/

The figure above illustrates the effectiveness of our approach in two key scenarios. Panel (a) shows an “unknown” process created by combining data from multiple known processes. Here, the generalized unfolder outperforms a “dedicated” unfolder, which is designed to unfold only a single specific physics process. Panel (b) provides further evidence of the generalized unfolder’s flexibility, demonstrating its ability to accurately unfold data from graviton production (generated in the context of large extra-dimension scenarios) accompanied by jets, a completely new physics process absent from the training phase. In both cases, the generalized unfolder achieves accuracy within typical LHC uncertainty budgets.
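The “unknown” process of panel (a) is described as a combination of data from multiple known processes. One way such a test sample could be assembled is sketched below in Python with NumPy. This is an illustrative assumption, not the procedure used in the study: the function name `mix_processes`, the mixing fractions, and sampling with replacement are all hypothetical choices.

```python
import numpy as np


def mix_processes(process_samples, fractions, n_events, seed=0):
    """Build a mixed sample by drawing events from several known processes.

    process_samples: list of arrays, one per process, shape (n_i, n_features).
    fractions: relative weight of each process (normalized internally).
    n_events: total number of events in the mixed sample.
    """
    rng = np.random.default_rng(seed)
    fractions = np.asarray(fractions, dtype=float)
    fractions = fractions / fractions.sum()

    # Number of events contributed by each process.
    counts = rng.multinomial(n_events, fractions)

    parts = []
    for events, count in zip(process_samples, counts):
        events = np.asarray(events)
        # Draw rows with replacement so small samples can still contribute.
        idx = rng.integers(0, len(events), size=count)
        parts.append(events[idx])

    mixed = np.concatenate(parts, axis=0)
    rng.shuffle(mixed)  # shuffle rows in place so processes are interleaved
    return mixed
```

If each row stacks the truth-level and detector-level features of the same event side by side, the pairing survives the mixing, so the unfolded output can still be compared against the mixed truth-level distribution.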

Outlook & References

Further investigation is needed to determine the extent of the cDDPM’s inductive bias and its tolerance to variations in the underlying physics processes. Understanding these aspects will help refine the method and ensure its robustness across a wide range of scenarios. Additionally, the current studies were performed on QCD jets, and extending this method to other particle types is necessary for its comprehensive application in data analysis. Addressing particles outside detector thresholds and accounting for systematic and experimental uncertainties are crucial improvements needed to fully realize the method’s potential in practical applications.

[1] https://arxiv.org/pdf/2406.01507