1 of 37

Show Me A Function: More Than Meets The Eye

by

Chinedu Eleh

(Advisor: Dr. Hans Werner van Wyk)


2 of 37

Definition of Function

  • A function $f$ from a set $X$ to a set $Y$ assigns to each element $x$ of $X$ exactly one element $f(x)$ of $Y$

3 of 37

Using Simple Functions to Build Complex Ones

  • Functions we learn in precalculus, calculus, etc.:
    • polynomials
    • exponential
    • trigonometric
    • inverse
    • composite functions, etc.


4 of 37

Polynomial Functions


  • Spline Interpolation
  • Finite elements
  • ReLU activation function

5 of 37

Spline Basis

Let $\chi_A$ be the indicator function of the set $A$, i.e. $\chi_A(x) = 1$ if $x \in A$ and $0$ otherwise.
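A minimal NumPy sketch of the idea (the uniform knots here are an assumption): indicator functions give the piecewise-constant spline basis, and "hat" functions give the piecewise-linear one.

```python
import numpy as np

knots = np.linspace(0.0, 1.0, 6)        # illustrative uniform knots, spacing h
h = knots[1] - knots[0]

def indicator(x, a, b):
    # Indicator of the interval [a, b): 1 inside, 0 outside
    return ((x >= a) & (x < b)).astype(float)

def hat(x, i):
    # Piecewise-linear "hat" basis function centered at knots[i]
    return np.maximum(0.0, 1.0 - np.abs(x - knots[i]) / h)

x = np.linspace(0.0, 1.0, 5)
print(indicator(x, knots[1], knots[2]))  # piecewise-constant basis element
print(hat(x, 2))                         # piecewise-linear basis element
```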

6 of 37

Spline Interpolation


7 of 37

Finite Element Basis

  • The finite element method is built on a similar idea to the spline basis
  • The basis can consist of constant, linear, quadratic, or higher-order piecewise polynomials


8 of 37

Finite Element Approximation

  • The goal is to solve boundary value problems, say

$-\big(a(x)\,u'(x)\big)' = f(x) \ \text{ on } (0,1), \qquad u(0) = u(1) = 0$

  • A discretization of the form

$u_h(x) = \sum_{j=1}^{n} u_j\,\phi_j(x)$

is assumed, where $V_h$, spanned by $\{\phi_j\}_{j=1}^{n}$, is a finite-dimensional approximation of the unknown infinite-dimensional space.

Differential Form → Weak Form → Discretization → Linear System
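Spelled out for the model problem above, each stage reads (a sketch consistent with the later slides):

```latex
\begin{align*}
\text{Differential form:}\quad & -(a\,u')' = f \ \text{ on } (0,1),\quad u(0)=u(1)=0\\
\text{Weak form:}\quad & \int_0^1 a\,u'\,v'\,dx = \int_0^1 f\,v\,dx \quad \forall\, v \in H_0^1(0,1)\\
\text{Discretization:}\quad & u_h = \sum_{j=1}^{n} u_j\,\phi_j,\qquad v = \phi_i\\
\text{Linear system:}\quad & K\mathbf{u} = \mathbf{F},\quad K_{ij} = \int_0^1 a\,\phi_j'\,\phi_i'\,dx,\quad F_i = \int_0^1 f\,\phi_i\,dx
\end{align*}
```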

9 of 37

ReLU Activation

  • The ReLU activation function is defined as $\mathrm{ReLU}(x) = \max(0, x)$
  • It can create sufficient nonlinearity in neural network layers to learn virtually any mapping
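In NumPy, for reference:

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x), applied elementwise
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # -> [0.  0.  0.  1.5]
```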

10 of 37

Neural Networks

  • Made up of compositions of affine functions, with activations creating nonlinearity where necessary
  • Can take any tensor input (CNN, RNN, VAE, etc.)
  • The hidden activation $\sigma$ is one of the sigmoid, tanh, ReLU, or linear functions
  • The output activation is mostly linear, sigmoid, or softmax, depending on whether the problem is regression, binary classification, or multiclass classification
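A minimal NumPy sketch of such a composition; the layer sizes (3 → 8 → 1) are arbitrary assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

# Tiny MLP: affine map -> ReLU -> affine map (linear regression head)
W1, b1 = rng.standard_normal((8, 3)), np.zeros(8)
W2, b2 = rng.standard_normal((1, 8)), np.zeros(1)

def network(x):
    h = relu(W1 @ x + b1)   # affine map, then elementwise nonlinearity
    return W2 @ h + b2      # linear output

print(network(np.array([0.5, -1.0, 2.0])))
```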

11 of 37

Trigonometric Functions

  • Widely applicable in
    • Fourier transform
    • Activation functions (sinc)
    • Anywhere periodicity is desired


12 of 37

Exponential Functions

  • Exponential growth and decay
  • Density
  • Kernels (SVM, RKHS, covariance)
  • Activation functions (sigmoid, softmax)
  • Wavelets


13 of 37

Inverse Functions

  • Activation functions (arctan)
  • Loss function (e.g. log in cross entropy, KL divergence)


14 of 37

Function Discovery

  • Three main ways of discovering new functions
    • Calculus of variations
    • Statistics
    • Differential equations
  • The calculus of variations dates back to Euler and Lagrange; statistical methods and differential equations have rich histories as well, yet all three remain part of the state of the art


15 of 37

Calculus of Variations

[Figure: a curve of fixed length $\ell$ joining points A and B, above the chord AB.]

Length of curve: $\ell = \int_a^b \sqrt{1 + y'(x)^2}\,dx$

Area to be maximized: $\int_a^b y(x)\,dx$

  • The Classical Isoperimetric Problem: determine a curve of fixed length $\ell$ that connects points A and B such that, when combined with the line segment AB, it encloses the largest possible area.

16 of 37

Euler-Lagrange Equations

  • The maximizer of the constrained optimization is an arc of the circle

$(x - c_1)^2 + (y - c_2)^2 = r^2,$

which is a solution to the Euler-Lagrange (differential) equation

$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0$

  • The isoperimetric problem is solved by a function.
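For a sketch of how the circle appears: introduce a multiplier $\lambda$ and apply the Euler-Lagrange equation to the augmented integrand ($c_1, c_2$ below are integration constants):

```latex
F(y, y') = y + \lambda\sqrt{1 + y'^2}
\qquad\Longrightarrow\qquad
0 = \frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}
  = 1 - \lambda\,\frac{d}{dx}\!\left(\frac{y'}{\sqrt{1 + y'^2}}\right)
```

Integrating once gives $y'/\sqrt{1 + y'^2} = (x - c_1)/\lambda$, and integrating again gives $(x - c_1)^2 + (y - c_2)^2 = \lambda^2$, a circle of radius $\lambda$.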

17 of 37

More on Calculus of Variations

  • The arclength problem
  • Brachistochrone problem
  • Fermat’s principle
  • Shape of a hanging rope


18 of 37

Statistical Methods

Find the equation of the line that passes through the points … and … (Slido only)

19 of 37

Question (Slido only)

As a mathematician, in one sentence, describe …

20 of 37

A Line Through Points?

  • Given a pencil, a ruler, and a pair of compasses
  • Draw many circles and measure
    • the circumference $C$
    • the radius $r$
  • What is the ratio $C/r$?
  • Before $\pi$ was discovered, nobody knew this ratio is constant ($C/r = 2\pi$)
  • However, given a circle, one could easily measure its radius and circumference (see the sketch below)

Abundance of Data
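This is exactly a line-fitting problem: plot measured circumference against radius, and the slope of the line recovers $2\pi$. A sketch with simulated noisy measurements:

```python
import numpy as np

rng = np.random.default_rng(42)
r = rng.uniform(1.0, 10.0, 50)                      # measured radii
C = 2 * np.pi * r + 0.1 * rng.standard_normal(50)   # noisy circumferences

slope = np.sum(r * C) / np.sum(r * r)   # least squares for a line through the origin
print(slope, 2 * np.pi)                 # slope ≈ 6.283...
```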

21 of 37

Yes, Using Tools from Linear Algebra

21

How to Solve

?

22 of 37

Moore-Penrose Inverse (Newton Method)

  • A dual formulation of the minimization problem is

$\hat{\beta} = \arg\min_{\beta} \|X\beta - y\|_2^2,$

the maximum likelihood estimate, where

$X = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix},$

and $X$ has a column of ones in its first dimension.

  • The Moore-Penrose pseudoinverse $X^{+} = (X^\top X)^{-1}X^\top$ satisfies $\hat{\beta} = X^{+} y$ as the minimizer of the optimization problem.
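A NumPy sketch on made-up points (np.linalg.pinv computes $X^{+}$ directly):

```python
import numpy as np

# Hypothetical data points (x_i, y_i) lying roughly on the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

X = np.column_stack([np.ones_like(x), x])   # design matrix with a column of ones
beta = np.linalg.pinv(X) @ y                # Moore-Penrose solution X^+ y
print(beta)                                 # approx [intercept, slope] = [1, 2]
```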

23 of 37

A Line Through Points


24 of 37

Moore-Penrose Inverse Sensitive to Outliers


25 of 37

Methods of Solving Least Squares Problems

Linear Least Squares:

  • Moore-Penrose Inverse
  • Newton Method

Nonlinear Least Squares:

  • Gradient Descent
  • Gauss-Newton Method
  • Levenberg-Marquardt method
  • Stochastic Gradient Descent
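As an illustration of the nonlinear case, a minimal sketch using SciPy's Levenberg-Marquardt solver on synthetic exponential-decay data (the model and data here are assumptions):

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = 2.5 * np.exp(-1.3 * x) + 0.05 * rng.standard_normal(x.size)  # noisy samples

def residuals(theta):
    a, b = theta
    return a * np.exp(-b * x) - y   # model minus data

# method="lm" selects SciPy's Levenberg-Marquardt implementation
fit = least_squares(residuals, x0=[1.0, 1.0], method="lm")
print(fit.x)  # roughly [2.5, 1.3]
```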


26 of 37

Remarks

  • Errors encountered in estimation are mostly parallax errors
  • Parallax errors can be reduced by statistical averaging, but they introduce uncertainty into the measurements
  • In heterogeneous media such as composites, geological media, gels, foams, and cell aggregates, these uncertainties could take any distribution and could, in fact, be undetermined but useful material properties
  • An accurate description of a measured value should also characterize the uncertainty in the obtained value

27 of 37

Differential Equations

  • The Euler-Lagrange equation is a differential equation
  • Rates are ubiquitous in day-to-day life
    • speed
    • acceleration
    • reaction rate
    • power
    • inflation rate
    • tax rate
    • unemployment rate
    • birth rate
    • interest rate
    • marginal
  • More rates arise from Newton’s laws and conservation laws in the natural and physical sciences


28 of 37

Functions From Differential Equations

  • Consider the simple elliptic equation

$-\big(a(x)\,u'(x)\big)' = f(x) \quad \text{on } (0,1), \qquad u(0) = u(1) = 0,$

where $a$ could be

  • Young’s modulus of a material
  • Absolute permeability of rocks

29 of 37

Data Driven Modeling

  • For any of these problems, $a$ is never known, but measurements of $u$ are easily, and in most cases cheaply, obtained
  • Finding $a$ from data is called data-driven modeling
  • In some communities, $a$ is discovered through inverse problems
  • In general, $a$ may be heterogeneous

30 of 37

Regression

  • Close your eyes to the physical law and fit

$a(x) \approx \sum_{k=1}^{m} c_k\,\phi_k(x),$

where the $\phi_k$’s are elements of any suitable basis known to the researcher, such as polynomials or trigonometric functions (a sketch follows this list)

  • Drawbacks:
    • inductive bias
    • a futile adventure if the solution lives outside the span of the $\phi_k$’s
    • prior knowledge of physical laws is not exploited
  • If the solution lives in a subspace of the span of the $\phi_k$’s, techniques such as PCA are used to handle collinearity and dimensionality reduction
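A sketch of such a fit: least squares against a monomial basis, with a hypothetical stand-in for the unknown coefficient:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
a_true = np.exp(np.sin(2 * np.pi * x))               # stand-in "unknown" coefficient
data = a_true + 0.05 * rng.standard_normal(x.size)   # noisy observations

# Columns of Phi are the chosen basis phi_k: here monomials 1, x, ..., x^5
Phi = np.vander(x, N=6, increasing=True)
c, *_ = np.linalg.lstsq(Phi, data, rcond=None)   # least-squares coefficients
a_hat = Phi @ c                                  # best fit within span{phi_k}
print(np.max(np.abs(a_hat - a_true)))            # error of the fit
```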

31 of 37

Weak Form (FEM)

  • Let $v \in H_0^1(0,1)$. Multiplying the differential form by $v$ and integrating by parts gives

$\int_0^1 a(x)\,u'(x)\,v'(x)\,dx = \int_0^1 f(x)\,v(x)\,dx \qquad \text{for all } v \in H_0^1(0,1)$

32 of 37

Finite Element Approximation - Revisit

  • Let $V_h$ be a finite-dimensional subspace of $H_0^1(0,1)$ in which we seek an approximate solution of the form $u_h(x) = \sum_{j=1}^{n} u_j\,\phi_j(x)$
  • Within the Galerkin framework, we assume $v = \phi_i$. So

$\sum_{j=1}^{n} u_j \int_0^1 a\,\phi_j'\,\phi_i'\,dx = \int_0^1 f\,\phi_i\,dx, \qquad i = 1,\dots,n,$

simplifying to $K\mathbf{u} = \mathbf{F}$, where $K_{ij} = \int_0^1 a\,\phi_j'\,\phi_i'\,dx$ and $F_i = \int_0^1 f\,\phi_i\,dx$
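A minimal sketch of the assembly and solve with P1 hat functions on 30 elements; the coefficient $a$ and load $f$ below are illustrative stand-ins, not necessarily those used in the experiments:

```python
import numpy as np

# P1 finite elements for -(a u')' = f on (0,1), u(0) = u(1) = 0
n_el = 30
x = np.linspace(0.0, 1.0, n_el + 1)    # mesh nodes
h = x[1] - x[0]
a = lambda s: 1.0 + s                  # hypothetical coefficient
f = lambda s: 1.0                      # hypothetical load

n = n_el - 1                           # number of interior nodes
K = np.zeros((n, n))                   # stiffness matrix
F = np.zeros(n)                        # load vector

for e in range(n_el):                  # element [x_e, x_{e+1}]
    m = 0.5 * (x[e] + x[e + 1])        # midpoint quadrature point
    k_loc = (a(m) / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    f_loc = f(m) * h / 2.0             # load split between the two nodes
    for i_loc, i in enumerate((e, e + 1)):   # global node numbers
        if 1 <= i <= n:                      # skip Dirichlet boundary nodes
            F[i - 1] += f_loc
            for j_loc, j in enumerate((e, e + 1)):
                if 1 <= j <= n:
                    K[i - 1, j - 1] += k_loc[i_loc, j_loc]

u = np.linalg.solve(K, F)              # interior values of u_h
print(u[:5])
```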

33 of 37

Numerical Experiments

  • With the coefficient $a$ and load $f$ as above, and 30 elements

34 of 37

Neural Networks - Experiments

  • We train a network of 3 fully connected layers on 1000 sampled points, with ReLU activation on the first two layers and an MSE loss
  • Adam optimizer, learning rate of 0.0001
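A sketch of this setup in PyTorch; the layer widths, target function, and epoch count are illustrative assumptions:

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.rand(1000, 1)              # 1000 sampled points
y = torch.sin(2 * torch.pi * x)      # hypothetical target function

model = nn.Sequential(
    nn.Linear(1, 64), nn.ReLU(),     # ReLU on the first two layers
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),                # linear output for regression
)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for epoch in range(5000):            # small learning rate -> many epochs needed
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
print(loss.item())
```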

35 of 37

Convergence Requires Many Epochs


36 of 37

Summary

  • We presented a brief trajectory of functions in mathematics, from Euler-Lagrange to state-of-the-art machine learning models
  • Gave insight into where the “least of the leasts” are applied in day-to-day life
  • Showed how functions are discovered from data via statistics and differential equations
  • Made connections between statistics, differential equations, and the calculus of variations
  • Whether you are interested in pure or applied mathematics, you are stuck with functions
  • The next time you press a button to get a cup of coffee, I challenge you to think about the function behind the scenes: no functions, no automation

37 of 37

Thanks for your attention