1 of 19

A practical example of linear equations in the “column picture” or “row picture”, bigger than the 2x2 and 3x3 made-up examples we use for hand calculations:

Radial Basis Function

Interpolation

Prof. Steven G. Johnson

MIT course 18.C06/6.C06, Fall 2024

2 of 19

Given (“training”) data at scattered points in multiple dimensions, how can one interpolate smoothly in between the points?

image: hec.usace.army.mil

Example:

Given measured temperatures at a few weather stations, how can you interpolate the temperatures T(x) to locations in between the stations?

3 of 19

Radial Basis Functions (RBFs): one of many methods

Image: https://shihchinw.github.io/2018/10/data-interpolation-with-radial-basis-functions-rbfs.html

distance r from data point x_k

Φ(r): “radial basis function” RBF

e.g. Gaussian / bell curve

put some kind of smooth “bump” function at the location of each data point:

& interpolated T(x) = linear combination of the RBFs with some coefficients c:

c₁

c₂

c₃

c₄

c₅

T(x)

each data point

has a “localized”

effect

cartoon

1d plot

“spread”

distance r

from x_k

(note: RBF Φ does not actually need to be localized!)

4 of 19

What linear combination (= what c’s)

of basis functions

matches the given data points?

Linear equation Ac=b!

Columns of A = basis function values!

b = data values to match (e.g. temperatures)!

5 of 19

RBF coefficients: Linear equation Ac = b!

must match measured data (x_j, T_j):

for j = 1,2, …, n

n equations in n unknowns (c_k) !

rhs b

unknowns c

n columns = RBFs centered at n data points

n × n matrix A

n rows = constraints to match n data values

distance r

from x_k

6 of 19

RBF coefficients: Linear equation Ac = b!

must match measured data (x_j, T_j):

for j = 1,2, …, n

n equations in n unknowns (c_k) !

rhs b

unknowns c

n columns = basis functions

centered at n points

n × n matrix A = A^T

n constraints

distance r

from x_k

7 of 19

Temperature interpolation example:

image: hec.usace.army.mil

Given measured temperatures at a few weather stations, use RBFs to interpolate the temperatures T(x) to locations in between the stations.

(x = 2-component coordinates)

(n = 5 equations)

8 of 19

Temperature interpolation example:

Given measured fictitious temperatures at a few weather stations, use RBFs to interpolate the temperatures T(x) to locations in between the stations.

(x = 2-component coordinates)

(n = 5 equations)

9 of 19

Temperature interpolation with Gaussian RBFs:

“spread” ~ R

Gaussian basis function

Spread R ~ data separation:

Nice smooth interpolation

(note: extrapolating outside data

⟶ nonsense)

10 of 19

Temperature interpolation with too-small Gaussian RBFs:

“spread” ~ R

Gaussian basis function

Spread R << data separation:

isolated “islands”

11 of 19

What if spread is too large?

RBFs are nearly constant

… A has nearly parallel columns

… nearly singular?

… seems bad?

[“ill-conditioned” — future 18.C06 topic!]

(In practice RBFs are usually combined with some low-degree polynomial basis functions like {1, x, y, xy} to be better behaved.)

12 of 19

Meta-question: Validation and choice of hyper-parameters?

How do we choose “hyper-parameters” like R (or Φ itself)?

How do we check (“validate”) that the interpolation is “good”?

Strategy 1: have some additional “validation” data (/ reserve some data) that is not used for interpolation/fitting, but is only used afterwards to check that the interpolation is accurate at those additional points.
Strategy 2: check interpolation against additional prior/physical knowledge (e.g. we know that temperature stays in certain range).
Strategy 3: check sensitivity to noise/small changes in training data

13 of 19

Radial Basis Functions 2.0:

An Example of Underdetermined Systems

14 of 19

Adding More Basis Functions? (“Augmenting” the RBFs.)

We saw that we could “guess wrong” on the basis functions, e.g. by making them too localized.

Maybe we can fix it by adding more basis functions?

Maybe try polynomials, for example: {1,x,y,x²,y²,xy}

new model (fitting) function:

15 of 19

As matrices:

RBFs + polynomials, for example: {1,x,y,x²,y²,xy}

Gives an n × (n+6) system of equations

Underdetermined system (“wide” matrix): # constraints n < # unknowns n+6

… i.e. columns not independent (not really a “basis”)

… i.e. infinitely many solutions d … which to pick?

16 of 19

Most common solution for RBF + polynomials:

For M = [A P] consisting of the square RBF part (A) and polynomial part (P), the solutions d = [c; p] consist of the RBF coefficients c and polynomial coefficients p.

Another common way to get a unique solution to the overdetermined problem is to require P^Tc = 0: make the RBF coefficients orthogonal to the polynomials at the x_k.

This gives a square system:

17 of 19

Alternative: minimum-norm solution:

A “wide” (underdetermined) problem Md=b has infinitely many solutions (N(M) ≠ {0}).

There are many application-specific strategies to pick one, but a useful common choice is to pick the one that minimizes ‖d‖: the minimum-norm solution.

ℝⁿ

all solutions d to Md=b

(“affine subspace” = subspace + shift)

N(M)

minimum-norm solution d_⋆

Minimum-norm solution (for full row rank):

d_⋆ = Mᵀ(MMᵀ)⁻¹b ( = M \ b in Julia for “wide” M)

Here, it makes sense! Make coefficients as small as possible … avoid huge terms that happen to cancel at the training points!

18 of 19

Temperature interpolation example:

“spread” ~ R = 10mi

too-localized

Gaussian basis functions

add {1,x,y,x²,y²,xy},

& find

minimum-norm coefficients

looks sensible!

19 of 19

Geometry of minimum-norm solution to Ax=b:

ℝⁿ

all solutions x to Ax=b

(“affine subspace” = subspace + shift)

N(A)

minimum-norm solution

x_⋆ ⟂ N(A) ⟺ x_⋆ ∈ C(Aᵀ)

Why does this give our formula:

x_⋆ = Aᵀ(AAᵀ)⁻¹b ?

( = A \ b in Julia for “wide” A)

By inspection, Ax_⋆=b (plug it in!)

By inspection, x_⋆ ∈ C(Aᵀ) (since x_⋆ = Aᵀ(some vector))