1 of 31

Optimising inflationary features – the Bayesian way

Jan Hamann

based on arXiv:2110.XXXXX

with

Julius Wons

International Joint Workshop on the SM and Beyond

NTHU, 14th-15th October 2021

2 of 31

Inflation

Inflaton field ϕ

Inflaton potential V(ϕ)

field perturbation

𝛿ϕ

curvature perturbation

𝛿R

3 of 31

Initial conditions for structure formation

curvature perturbation

𝛿R

CMB

Large Scale

Structure

4 of 31

Standard inflation: smooth power spectrum

curvature perturbation

𝛿R

primordial power spectrum

5 of 31

Standard inflation: smooth power spectrum

Assumptions

  • Slow-roll attractor reached
  • Smooth inflaton potential
  • Bunch-Davies vacuum
  • Other fields can be integrated out

primordial power spectrum

6 of 31

Inflation with features

Assumptions

  • Slow-roll attractor reached
  • Smooth inflaton potential
  • Bunch-Davies vacuum
  • Other fields can be integrated out

break any of these

generically: oscillations in k or ln k

Features!

[Chluba, JH, Patil 2015]

7 of 31

Features models: examples

P(k) = P₀(k) [1 + δP(k)]   (ordinary power law × modulation)

Linear oscillation model: δP(k) ∝ sin(ω k + φ)

(effects periodic in conformal time)

Logarithmic oscillation model: δP(k) ∝ sin(ω ln k + φ)

(effects periodic in cosmic time)

8 of 31

Fit to CMB data

Planck data

CMB temperature

angular power spectrum

residuals with respect to

LCDM best fit

9 of 31

Complications…

  • Complex structure of likelihood function
  • Increased requirements for numerical accuracy
    • Computation of CMB spectra
    • Likelihood evaluation

It takes O(1 min) to calculate the likelihood for a single combination of parameters

Commonly used Markov chain Monte Carlo methods are very inefficient here

10 of 31

Bayesian optimisation

Step 1: Regression

Guess the shape of the function based on known function values ("data")

Step 2: Selection

Decide at which point to evaluate the next function value

Goal: find global maximum of function

(and learn general shape in the process)

11 of 31

Bayesian optimisation

Step 1: Regression — Gaussian Process Regression

Guess the shape of the function based on known function values ("data")

Step 2: Selection — acquisition function: Expected Improvement

Decide at which point to evaluate the next function value

Goal: find global maximum of function

(and learn general shape in the process)

12 of 31

Gaussian Process Regression (GPR)

Data: (xi, yi)

Covariance of the data: Σij

13 of 31

Gaussian Process Regression (GPR)

Data: (xi, yi) — what does the function look like between the data points?

14 of 31

Gaussian Process

  • Imagine that for each x, f(x) is a random Gaussian variate drawn from 𝒩(μ(x), σ²(x)), with mean μ(x) and variance σ²(x); a curve f is obtained by drawing a sample from this distribution at every x

15 of 31

Gaussian Process

  • A Gaussian process is specified by its covariance function and that function's hyperparameters
  • Example covariance function (squared exponential):

K(x, x′) = A² exp( −(x − x′)² / (2L²) )

with hyperparameters A (prior width) and L (correlation length)

[Figure: sample curves y(x) for larger/smaller A and L]
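As a concrete illustration (a sketch in Python, not the authors' code), a squared-exponential covariance function and draws from the corresponding zero-mean GP prior look like this; the hyperparameters A and L play the roles of the prior width and correlation length above:

```python
import numpy as np

def squared_exponential(x1, x2, A=1.0, L=0.2):
    """Example covariance function K(x, x') = A^2 exp(-(x - x')^2 / (2 L^2)).

    A is the prior width (vertical scale), L the correlation length:
    a larger L gives smoother curves, a larger A gives larger amplitudes.
    """
    return A**2 * np.exp(-((x1[:, None] - x2[None, :]) ** 2) / (2.0 * L**2))

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)

# For a zero-mean GP, a sample path evaluated on a grid is one draw
# from the multivariate normal N(0, K(x, x)).
K = squared_exponential(x, x) + 1e-10 * np.eye(len(x))  # jitter for stability
samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)
```

Plotting the rows of `samples` reproduces the kind of curves shown on this slide.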

16 of 31

Gaussian Process Regression (GPR)

  • Take function space generated by Gaussian process GP as prior probability distribution for f
  • Condition on the data (xi, yi)
  • GP|(xi,yi) is the posterior probability distribution for f (this is still an infinite space of functions, but one can now evaluate, e.g., expectation value and covariance)
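The conditioning step can be sketched in a few lines of Python (an illustrative implementation, not the authors' code; the squared-exponential kernel and noise level are assumptions for the example):

```python
import numpy as np

def kernel(x1, x2, A=1.0, L=0.3):
    # Squared-exponential covariance function K(x, x')
    return A**2 * np.exp(-((x1[:, None] - x2[None, :]) ** 2) / (2.0 * L**2))

def gp_posterior(x_data, y_data, x_test, noise_var=1e-6):
    """Posterior mean and covariance of GP|(xi, yi) at the test points.

    Conditioning a zero-mean GP prior on the data reduces to linear algebra:
      mean = K(x*, x) [K(x, x) + noise]^-1 y
      cov  = K(x*, x*) - K(x*, x) [K(x, x) + noise]^-1 K(x, x*)
    """
    K = kernel(x_data, x_data) + noise_var * np.eye(len(x_data))
    K_star = kernel(x_test, x_data)
    mean = K_star @ np.linalg.solve(K, y_data)
    cov = kernel(x_test, x_test) - K_star @ np.linalg.solve(K, K_star.T)
    return mean, cov

x_data = np.array([0.0, 0.5, 1.0])
y_data = np.array([0.0, 1.0, 0.0])
mean, cov = gp_posterior(x_data, y_data, np.array([0.0, 0.25, 0.5]))
# The posterior mean interpolates the (nearly noise-free) data,
# and the posterior variance shrinks to ~0 at the data points.
```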

17 of 31

Gaussian Process Regression

Input:

  • Data: (xi, yi)
  • Covariance of the data: Σij
  • Covariance function: K(x, x′)
  • Test values: xi*

Output (via straightforward linear algebra):

  • Target means: f(xi*)
  • Covariance of the targets: Σij*
  • Marginal likelihood ℰ(y|x, h), with hyperparameters h: the probability of the data given the model (and, with flat priors, a measure of the probability of the model given the data)

18 of 31

Gaussian Process Regression


Maximise marginal likelihood as function of hyperparameters

=

Let the data decide on the most appropriate Gaussian process!
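This step can be sketched as follows (an illustrative Python example using SciPy, not the authors' code; the toy data and noise level are assumptions):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(log_h, x, y, noise_var=1e-4):
    """-log p(y | x, h) for a zero-mean GP with squared-exponential kernel.

    The hyperparameters h = (A, L) are optimised in log space so they
    stay positive.
    """
    A, L = np.exp(log_h)
    K = A**2 * np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * L**2))
    K += noise_var * np.eye(len(x))
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * (y @ np.linalg.solve(K, y) + logdet
                  + len(x) * np.log(2.0 * np.pi))

x = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * x)

# "Let the data decide": maximise the marginal likelihood over h.
h0 = np.log([1.0, 0.1])
result = minimize(neg_log_marginal_likelihood, h0, args=(x, y))
A_best, L_best = np.exp(result.x)
```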

19 of 31

Gaussian Process Regression

20 of 31

Where to draw the next sample?

Exploration?

Exploitation?

21 of 31

Where to draw the next sample?

Exploration?

Exploitation?

Define an acquisition function dependent on GPR mean and uncertainty

Pick value that maximises acquisition function

Acquisition function:

Expected Improvement
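For a maximisation problem, expected improvement has a closed form in the GPR mean and uncertainty; a minimal Python sketch (the candidate values below are made up for illustration):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """EI acquisition for maximisation.

    Trades off exploitation (GPR mean mu above the current best f_best)
    against exploration (large GPR uncertainty sigma).
    """
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - f_best) / sigma
    return (mu - f_best) * norm.cdf(z) + sigma * norm.pdf(z)

# Candidates: confidently poor, confidently good, and very uncertain.
mu = np.array([0.2, 0.9, 0.5])
sigma = np.array([0.05, 0.01, 0.5])
i_next = int(np.argmax(expected_improvement(mu, sigma, f_best=0.8)))
```

Note that both the second (high mean) and third (high uncertainty) candidates get sizeable EI, while the confidently poor first candidate gets essentially none.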

22 of 31

Bayesian optimisation

[Figure, built up over slides 22–26: iterations 1–5 of Bayesian optimisation. Each frame shows the Gaussian Process Regression (GPR mean with uncertainty band) against the true function and the accumulated data, and below it the expected-improvement acquisition function used to select the next evaluation point.]
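The iteration shown in these frames can be sketched end to end on a cheap toy function (an illustrative Python sketch, not the authors' pipeline; in the real application f would be the expensive CMB likelihood):

```python
import numpy as np
from scipy.stats import norm

def kernel(x1, x2, A=1.0, L=0.1):
    # Squared-exponential covariance function
    return A**2 * np.exp(-((x1[:, None] - x2[None, :]) ** 2) / (2.0 * L**2))

def gp_mean_std(x_data, y_data, x_test, noise_var=1e-6):
    # Step 1 (regression): GPR mean and uncertainty on a test grid.
    K = kernel(x_data, x_data) + noise_var * np.eye(len(x_data))
    K_star = kernel(x_test, x_data)
    mean = K_star @ np.linalg.solve(K, y_data)
    var = np.diag(kernel(x_test, x_test)
                  - K_star @ np.linalg.solve(K, K_star.T))
    return mean, np.sqrt(np.maximum(var, 0.0))

def expected_improvement(mu, sigma, f_best):
    # Step 2 (selection): acquisition function.
    sigma = np.maximum(sigma, 1e-12)
    z = (mu - f_best) / sigma
    return (mu - f_best) * norm.cdf(z) + sigma * norm.pdf(z)

def f(x):  # toy multi-modal stand-in for the expensive likelihood
    return np.sin(8.0 * x) + 0.5 * np.cos(23.0 * x)

x_grid = np.linspace(0.0, 1.0, 400)
x_data = np.array([0.1, 0.5, 0.9])  # a few initial evaluations
y_data = f(x_data)

for _ in range(20):
    mu, sigma = gp_mean_std(x_data, y_data, x_grid)
    x_next = x_grid[np.argmax(expected_improvement(mu, sigma, y_data.max()))]
    x_data = np.append(x_data, x_next)
    y_data = np.append(y_data, f(x_next))

best_x = x_data[np.argmax(y_data)]
```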

27 of 31

Pros and cons of Bayesian Optimisation

+ high efficiency

+ excellent at finding global maximum of complicated functions

+ very good at determining overall shape, profiles of function

o currently no evaluation of marginalised posteriors or the Bayesian evidence (though this can in principle be done!)

- scales very unfavourably with number of dimensions (realistically it won’t work well for more than 5-6 dimensions)

28 of 31

Bayesian optimisation with feature models

[Figure: feature-model best fits; red dots: our results, compared with Planck inflation 2018, obtained using nested sampling]

  • Two orders of magnitude fewer function evaluations
  • Much better at finding global and local extrema

29 of 31

Feature best-fits vs. Planck residuals

30 of 31

Evidence for features?

Simulations of featureless Planck-like data

[Planck inflation 2015]

Logarithmic oscillation model

Linear oscillation model

It would require Δχ² > 20 to claim a detection

31 of 31

Conclusions

  • Inflationary features: unique signatures of new physics
  • Bayesian optimisation is a very efficient machine-learning technique for extremising unknown functions
  • It can also be applied to cosmological parameter estimation
  • We’ve improved on previous MCMC-based results for inflationary feature searches
  • Paper and code out soon!