1 of 88

Handling Uncertainty in

Estimation Problems with Lie Groups

Matías Mattamala

matias@robots.ox.ac.uk

25/01/2020

Motivation

Optimization-based estimation algorithms take measurements and models with uncertainties as inputs, and produce estimated variables with uncertainties as outputs. How do we ensure everything is consistent?

Contents

  1. Factor graphs
    • Brief overview of the inputs and outputs
  2. Lie Groups
    • Review of the meaning and main operators
  3. Bringing everything together
    • How to combine the previous ideas consistently in an estimation context

Part 1: Factor graphs

Lightspeed review

(Figure: a factor graph with variable nodes X1, X2, X3, …, Xn, Xn+1 and factor nodes p1, p2)

A factor graph factorizes a function

The variables X1, …, Xn+1 are unknown; the factors p1, p2 are known. Factors indicate how the variables are related to each other.

Factor graphs can be used to describe many problems

DRS examples

(Figure: DRS factor-graph systems KINS, VTR, and TOFG; works in progress and one under review)

Factorized function

A factor graph factorizes a function into a product of factors.

Factors as distributions

In general, the factors are given by probability distributions

So the factor graph represents a factorization of the joint distribution of many variables

Finding the unknowns

In the factor graph, the variables represent parameters of the distribution that must be found

We assume the parameters are single quantities (i.e., not distributions themselves)

Any parameter estimation method can be used

Finding the unknowns with MAP

The most common solution is maximum a posteriori (MAP).

Since maximizing a product is difficult, we generally minimize the negative log-likelihood instead.

*Please note the negative sign converts the maximization into a minimization
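As a sketch of why the negative log turns the product of Gaussians into a sum of squares, consider a toy 1D problem (all measurements and sigmas here are made up):

```python
import numpy as np

# Hypothetical 1D problem: three Gaussian factors on a scalar x,
# each with measurement z_i and standard deviation s_i.
z = np.array([1.0, 1.2, 0.9])
s = np.array([0.5, 0.2, 0.4])

def neg_log_likelihood(x):
    # -log prod_i N(z_i; x, s_i^2) = sum_i (z_i - x)^2 / (2 s_i^2) + const
    return np.sum((z - x) ** 2 / (2 * s ** 2))

# In this linear case the MAP estimate has a closed form:
# the information-weighted mean of the measurements.
w = 1.0 / s ** 2
x_map = np.sum(w * z) / np.sum(w)
```

More certain measurements (smaller sigma) pull the MAP estimate harder, which is exactly the weighting the least-squares cost encodes.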

Now come the assumptions

To make things easier, we assume the distributions are Gaussian.

However, we don’t say that Xi is Gaussian itself.

We say the error or residual between a function of Xi and some prior knowledge zi (e.g. measurements) distributes as a Gaussian, where Sigma corresponds to the sensor or model covariance (we’ll save this trick for later).

So, the probability of each factor is given by

Least squares minimization

Under the Gaussian assumption, this converts the problem into a (nonlinear) least squares minimization (here we ignore some constant terms).

Inspecting the Least Squares (LS) problem

This term is usually ignored since Sigma is constant*

*We can keep the term and also optimize it (but it would require other solvers/strategies, such as expectation-maximization). In this case, we would be optimizing the sensor models (“learning the covariances”).

These sigmas (measurement/model covariances) are the only uncertainties we plug into the system. The remaining term is the error or residual of the factor.

The solution only returns a value but not an uncertainty estimate (LS is a point estimator).

Solving the Least Squares (LS) problem

LS can be solved in closed-form if h(X) is linear

By setting the derivative to zero, we find the normal equations.

The resulting matrix is known as the information matrix, Fisher information, or Hessian.
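A minimal numeric sketch of the normal equations, with made-up data (not from the slides):

```python
import numpy as np

# Linear least squares: minimize ||A x - b||^2 with toy data.
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))
b = rng.standard_normal(10)

# Normal equations: (A^T A) x = A^T b; A^T A is the information matrix.
info = A.T @ A
x_normal = np.linalg.solve(info, A.T @ b)

# Cross-check against the library least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Both routes give the same minimizer; in practice the normal equations are preferred in factor-graph solvers because the information matrix is sparse.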

Analyzing the Information Matrix

Obs. 1: Matrix A cannot have zero columns, since they would make the linear system indeterminate

Obs. 2: The new matrix is denser than A, so it is expensive to invert when solving the linear system. Factorizing the matrix and reordering its rows and columns is the usual trick to solve by back-substitution without inverting (e.g. iSAM, qpSWIFT).
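A rough illustration of solving via factorization instead of inversion. A dense Cholesky factor stands in here for the sparse, reordered factorizations used by incremental solvers; the triangular solves are written with NumPy's generic solver for brevity:

```python
import numpy as np

# Solve (A^T A) x = A^T b by Cholesky factorization + substitution,
# never forming the inverse explicitly (illustrative toy data).
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)
info, rhs = A.T @ A, A.T @ b

L = np.linalg.cholesky(info)   # info = L L^T, with L lower-triangular
y = np.linalg.solve(L, rhs)    # forward substitution:  L y = rhs
x = np.linalg.solve(L.T, y)    # back substitution:     L^T x = y
```

For sparse problems, a good variable ordering keeps L sparse, which is what makes the factorization approach so much cheaper than inversion.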


Obs. 3: The matrix approximates* the information (inverse covariance) of the solution of the LS problem

However, we need to invert the matrix to obtain the covariance of the solution

Marginal information matrices can be obtained, but still require inversion to get covariances

*It approximates the covariance of the solution because we’re not explicitly optimizing the covariance of the solution - this is called a “Laplace approximation”. To optimize the covariance as well we can use variational inference - Barfoot has some recent papers on it.
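A sketch of recovering covariances from the information matrix, with a toy Jacobian and an assumed i.i.d. noise level:

```python
import numpy as np

# The covariance of the LS solution is the inverse information matrix
# (a Laplace approximation around the optimum). Toy 4-variable example.
rng = np.random.default_rng(2)
J = rng.standard_normal((12, 4))
sigma_meas = 0.1                      # assumed measurement noise std

info = J.T @ J / sigma_meas ** 2      # information (inverse covariance)
cov = np.linalg.inv(info)             # covariance of the solution

# The marginal covariance of the first two variables is the sub-block
# of the *covariance*, not of the information matrix.
marginal = cov[:2, :2]
```

Note the asymmetry: marginal information requires Schur complements, while marginal covariance is just a block of the full inverse, which is why the inversion is hard to avoid.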

Nonlinear factor graphs

If h(X) is a nonlinear function, we must use a nonlinear optimization algorithm

Algorithms such as Gauss-Newton or Levenberg-Marquardt linearize the system around the initial solution (the linearization point), using the Jacobian of h(X).

Solving nonlinear factor graphs

Applying the linearization, we obtain a linear system as before

Normal equations

So we can solve it with the same normal equations (with Jacobians instead of the A matrix). Jacobians must be well-defined (i.e., no zero columns). Levenberg-Marquardt can handle ill-conditioned systems (but cannot be used in incremental problems with iSAM).

Updating the solution

The solution of the linear system is a small correction applied to the current linearization point to improve the solution

We relinearize at the new linearization point and repeat until convergence

All the covariances that can be extracted from the information matrix are valid for the linearization point only
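The linearize-solve-update loop can be sketched on a toy scalar problem (the model h(x) = exp(x) and the data are made up):

```python
import numpy as np

# Minimal Gauss-Newton loop: minimize sum_i (z_i - h(x))^2 with h(x) = exp(x).
z = np.array([2.0, 2.2, 1.9])
x = 0.0                                  # initial linearization point

for _ in range(20):
    r = z - np.exp(x)                    # residuals at the current point
    J = np.full_like(z, -np.exp(x))      # Jacobian d r / d x
    dx = -np.sum(J * r) / np.sum(J * J)  # scalar normal equations
    x += dx                              # update, then relinearize
    if abs(dx) < 1e-12:
        break
```

At convergence exp(x) equals the mean of the measurements, the analytic minimizer of this toy cost.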

Summary

Covariance of the solution: obtained from the information matrix of the linear system.

The factors (measurements and covariances) are the inputs; the variables (means and covariances) are the outputs.

Part 2: Lie groups

Dealing with rotations and transformations

When designing a factor graph, we need factors with residual functions that output vector quantities.

Some factors can have rotations or rigid body transformations involved.

3D Rotations

Rotations can be represented in many ways, but have 3 inherent DoF:

  • Euler angles (ɑ, β, 𝛄): 3x1 vector
  • Axis-angle (v, θ): 3x1 vector
  • Quaternions q = (w, x, y, z): unit 4x1 vector
  • Rotation matrices R: 3x3 orthogonal matrix

How do we measure the difference of rotations? What about poses?

Lie Groups

Lie Groups offer a principled way to solve these issues with a common framework

Rotation matrices, quaternions and rigid-body transformations are Lie groups, so the same principles can be applied with all of them.

Special Orthogonal Group SO(3): rotations. Special Euclidean Group SE(3): rigid-body transformations.

Lie Groups, roughly

A Lie group is both:

  • A smooth manifold: a differentiable surface that “looks Euclidean” at any point
  • A group: a set with a composition operation satisfying closure, identity, inverse, and associativity

Solà, Deray, Atchuthan (2018), “A micro Lie theory for state estimation in robotics”, arXiv


The main advantage of Lie Groups is that any element on the manifold can be mapped into a Euclidean vector space (Lie algebra)

So we can use the same tools from vector spaces, plus some extra rules


Lie Groups - with figures

(Figure: the Lie group manifold and its tangent vector space, the “Lie algebra”*)

*Note: This is not exactly the Lie algebra, but a mapping can be established through the hat and vee operators

The Logarithm map takes elements from the group to the vector space; the Exponential map takes vectors from the vector space back onto the group.
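A minimal SO(3) sketch of the hat, Exp (Rodrigues' formula), and Log operators; the implementation and test values are illustrative:

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def Exp(w):
    """Exponential map of SO(3) via Rodrigues' formula."""
    th = np.linalg.norm(w)
    if th < 1e-10:
        return np.eye(3) + hat(w)        # first-order approximation
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * K @ K

def Log(R):
    """Logarithm map of SO(3) (valid away from theta = pi)."""
    th = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    if th < 1e-10:
        return np.zeros(3)
    return th / (2 * np.sin(th)) * np.array(
        [R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])

w = np.array([0.3, -0.2, 0.5])
R = Exp(w)
```

For rotation vectors with angle below pi, Log exactly inverts Exp, which is what lets us use tangent-space vectors as residuals.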

The tangent space (and its maps) can be defined at the identity or at any other element T of the group.

Right-hand convention: used in GTSAM (Dellaert, Carlone, Scaramuzza, Solà) and in DRS.

Left-hand convention: used by Barfoot, Mangelson, and other authors.

Lie Groups - Adjoint operator

The Adjoint operator relates perturbations expressed in the left-hand and right-hand conventions.
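For SO(3) the Adjoint of R is R itself, which gives a one-line numeric check of how the Adjoint converts a right-hand perturbation into a left-hand one (all values arbitrary):

```python
import numpy as np

# Identity checked below: R @ Exp(w) == Exp(Ad_R w) @ R, with Ad_R = R.
def hat(w):
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def Exp(w):
    th = np.linalg.norm(w)
    K = hat(w / th)
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * K @ K

R = Exp(np.array([0.1, 0.7, -0.3]))
w = np.array([0.05, -0.02, 0.04])

right = R @ Exp(w)       # right-hand (body-frame) perturbation
left = Exp(R @ w) @ R    # same group element, left-hand convention
```

The identity holds exactly: the Adjoint rotates the tangent vector so the perturbation can change sides without changing the group element.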

Importance of the convention in optimization problems

The Exponential and Logarithm maps allow us to map quantities between the Lie group and the Lie algebra

We can define residuals for the factor graph using the Logarithm map

Note: The definition of the logarithm map determines the order of the variables

In GTSAM:

  • Pose3 uses (rx, ry, rz, x, y, z)
  • Pose2 uses (x, y, θ) -> Thanks to Milad Ramezani for pointing this out!
  • Papers and libraries can vary in this definition too


The Logarithm map also determines the ordering of the covariance matrices

(input matrices in the factors, and output matrices in the information matrix)



The linearized linear system is solved in the tangent space, using vectors

However, instead of using the update rule:

We map the correction from the tangent space back on the manifold

(Right hand convention also important here)

52

Graphically

  1. Linearize all the factors evaluated at the current linearization point (“lifting”)
  2. Compute the optimization correction using the linear system (normal equations)
  3. Update the variables back on the manifold (“retracting”)

Part 3: Bringing everything together

A pose graph SLAM problem

Let’s return to modelling estimation problems with factor graphs: a SLAM pose graph over poses T1, T2, T3 with a prior factor, sensor factors, and odometry factors. We’ll focus on how to model the odometry factor.

Odometry factor

The odometry factor establishes the relationship between T1 and T2 given an odometry measurement ΔT. They are poses in SE(3)

It is straightforward to write the equation as T2 = T1 ΔT. Here we’re using the right-hand notation to apply the measurement… but why?
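A sketch of this composition with 4x4 homogeneous matrices (all pose values made up); the subscript-elimination rule appears as plain matrix multiplication:

```python
import numpy as np

# Poses as 4x4 homogeneous matrices in SE(3). T_W_B1 is the pose of
# body B1 in world W; T_B1_B2 is the relative odometry measurement
# expressed in B1 (hypothetical values).
def make_T(R, t):
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def rot_z(a):
    return np.array([[np.cos(a), -np.sin(a), 0],
                     [np.sin(a),  np.cos(a), 0],
                     [0, 0, 1]])

T_W_B1 = make_T(rot_z(0.3), np.array([1.0, 2.0, 0.0]))
T_B1_B2 = make_T(rot_z(0.1), np.array([0.5, 0.0, 0.0]))

# Right-multiplication composes frames: the B1 subscripts "eliminate".
T_W_B2 = T_W_B1 @ T_B1_B2

# Consistency check: recover the measurement from the two poses.
dT = np.linalg.inv(T_W_B1) @ T_W_B2
```

Inverting T_W_B1 and left-multiplying recovers exactly the relative measurement, which is the structure the odometry residual will exploit.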

The importance of the frames

But we need to be completely sure that we are doing the right thing

Here we’re missing an important physical concept: the reference frames


When we write down T2 = T1 ΔT, we are implying the following frames: T_W_B2 = T_W_B1 T_B1_B2.

Graphically

We are expressing poses of the body B in a fixed frame W. Our odometry measurements are relative to the previous frame B1. The resulting expression represents the pose of the body B at time 2 with respect to the same fixed frame W.

We can confirm the frames are right: subscripts should match between transformations and “eliminate” each other*.

*Please note that writing the reference frame on the left is redundant for poses; it is mainly relevant for vectors (Furgale, 2014) -> Thanks to Marco Camurri for mentioning this

Creating Gaussian factors

The odometry expression that we have is consistent with the frames, but it’s not a probability distribution

Hence, we cannot use it as a factor. What can we do?

Probability distributions in SE(3)

Let’s recall what we did in the linear case a few slides ago: we added a noise term with the desired distribution (zero-mean Gaussian). Even though we are now using Lie groups/manifolds, we can do the same trick.


We add a zero-mean Gaussian distribution defined on the tangent space, and project it back to the group using the Exponential map

Important considerations

We apply the noise on the right side to be consistent with the right-hand notation

The covariance we set here is expressed in base B at time 2

Graphical interpretation of the distribution

1. We define a distribution in the tangent space

2. We project it back to the group using the Exponential Map to define a distribution on the manifold*

*If we sample this distribution and plot the poses, we’ll obtain the “banana-shaped” distribution expected for poses
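To keep the sketch short, here is the SE(2) analogue of the same construction (the closed-form SE(2) Exp and all numbers are assumptions, not from the slides): sampling the tangent-space Gaussian and retracting produces the banana-shaped spread of positions.

```python
import numpy as np

def Exp_se2(xi):
    """Closed-form SE(2) exponential map for xi = (x, y, theta)."""
    x, y, th = xi
    if abs(th) < 1e-9:
        V = np.eye(2)
    else:
        V = np.array([[np.sin(th), -(1 - np.cos(th))],
                      [1 - np.cos(th), np.sin(th)]]) / th
    T = np.eye(3)
    T[:2, :2] = [[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]]
    T[:2, 2] = V @ np.array([x, y])
    return T

rng = np.random.default_rng(3)
mu = np.array([2.0, 0.0, 0.0])        # nominal tangent-space mean
Sigma = np.diag([0.01, 0.01, 0.3])    # large heading uncertainty

# Sample in the tangent space, then retract onto the manifold.
samples = [Exp_se2(rng.multivariate_normal(mu, Sigma)) for _ in range(200)]
xs = np.array([T[0, 2] for T in samples])
ys = np.array([T[1, 2] for T in samples])
```

The heading noise bends the translated positions along an arc: plotting (xs, ys) shows the banana shape, which a Gaussian defined directly on (x, y) could never capture.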

We isolate the noise on the right to obtain a zero-mean distribution on both sides. We can check everything is consistent by applying the inverses (which swap the subscripts and the reference frame), and then applying the subscript-elimination trick: the result is an object defined in base B at time 2.

Defining Gaussian factors

Now this expression is a Gaussian distribution in SE(3), which we can use to define the factor. This is the definition of the “BetweenFactor” in GTSAM.

Extracting uncertainties from the solution

From our analysis we can define any probability distribution on SE(3) as

This expression helped us to define Gaussian factors used in the graph.

But it also applies when we want to extract covariances from the solution


Let’s say we managed to invert the information matrix from the factor graph solution. The inverse matrix keeps all the covariances and cross-covariances of the variables involved: the probability distribution of the solution. In GTSAM, these correspond to covariances in the “base” frame (right-hand side).

Last comments

The definition of the probability distribution is useful to compute other operations and manipulate the uncertainties accordingly. For instance, for the distribution of the inverse, we use the Adjoint to move the Exp( ) to the right, obtaining a proper distribution in the right-hand convention.

The covariance gets transformed by the Adjoint, and now it represents the uncertainty in the world frame W.
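A sketch of the covariance frame change: for SO(3) the Adjoint is simply R, so the rule Sigma_W = Ad Sigma_B Ad^T reduces to a similarity transform (all numbers made up):

```python
import numpy as np

# Covariance frame change via the Adjoint (SO(3) sketch: Ad_R = R).
a = 0.8
R = np.array([[np.cos(a), -np.sin(a), 0],
              [np.sin(a),  np.cos(a), 0],
              [0, 0, 1]])
Sigma_B = np.diag([0.04, 0.01, 0.09])   # assumed body-frame covariance

Sigma_W = R @ Sigma_B @ R.T             # world-frame covariance
```

The transform keeps the matrix symmetric and preserves the total uncertainty (the trace); it only redistributes it across the new frame's axes.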

Other operations

A similar analysis and tools (Exponential and Logarithm map, Adjoint) can be used to derive other expressions for distributions*:

  • Composition
  • Difference of distributions
  • Interpolation
  • Averaging
  • Gaussian processes

In general, the means are easy to compute, but the covariances are tricky**

But if we follow the math and conventions, the resulting formulas should match the physical interpretation (i.e., reference frames).

*Not covered here, but papers are listed at the end (happy to discuss them as well!)

**Some properties of the Exponential map do not follow the usual properties of the exponential function: for instance, Exp(a + b) ≠ Exp(a) Exp(b) in general.

Conclusions

1. Lie groups are useful to unify many operations we usually do*

  • The Logarithm map allows us to compute vector differences
    • Between poses, rotations and other objects
    • Useful to compute residuals for both estimation and control
    • As long as a Logarithm map exists, we don’t have to manually choose how to subtract quantities
  • We can generate any element of a Lie Group from a vector, through the Exponential map
    • So we can apply vector corrections directly to group elements


*GTSAM is actually based on manifolds and retractions (a more general view)


2. Conventions are super important

  • Frame conventions
  • Right-convention for Lie groups
    • It’s related to our convention for the frames (world on the left, base on the right)
  • Logarithm map

They allow us to make sense of the quantities we plug in and extract from our estimation problems.

Simple example of potential problems:

  • Covariances in GTSAM’s Pose3: orientation, then position
  • Covariances in ROS’ PoseWithCovariance: position, then orientation
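Converting a covariance between the two orderings is a permutation; a sketch with a random, purely illustrative 6x6 matrix (GTSAM Pose3 order assumed as (rx, ry, rz, x, y, z), ROS as (x, y, z, rx, ry, rz)):

```python
import numpy as np

# Permutation from GTSAM ordering to ROS ordering:
# perm[i] = GTSAM index of the i-th ROS entry.
perm = [3, 4, 5, 0, 1, 2]
P = np.eye(6)[perm]                    # permutation matrix

rng = np.random.default_rng(4)
X = rng.standard_normal((6, 6))
cov_gtsam = X @ X.T                    # made-up SPD covariance

cov_ros = P @ cov_gtsam @ P.T          # reordered covariance
```

The rotation block moves from the top-left to the bottom-right; forgetting this permutation silently mislabels position noise as orientation noise.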


3. Conventions are not necessarily well documented

  • From the papers we can infer who uses which convention
  • For the implementations, it is better to check the Logarithm map and how they apply the Exponential map (left or right)
  • Covariance handling is not well documented in general
    • Fun fact: We were having this conversation in the gtsam-users group


Resources - papers

Lie Groups

  • Solà, Deray, Atchuthan (2018), “A micro Lie theory for state estimation in robotics”, arXiv
    • Complete and short introduction. It covers the main theoretical aspects and some practical estimation examples. Right-hand convention
  • Dellaert (2020), GTSAM docs
    • The documentation of GTSAM has many documents on Lie groups and optimization on manifolds. Right-hand convention
  • Lynch and Park (2016), "Modern Robotics: Mechanics, Planning, and Control", Cambridge University Press
    • Available online
    • Explains Lie groups in the context of kinematics and dynamics
    • It uses both conventions but it’s explicit in declaring which one is being used


Manipulating uncertainty on Lie Groups

Other papers

  • Calinon (2020), “Gaussians on Riemannian Manifolds: Applications for Robot Learning and Adaptive Control”, ICRA
    • Follows a similar approach to define Gaussians on Riemannian Manifolds
    • Their Gaussian distribution is different
    • Riemannian manifolds seem to be more general than Lie Groups (I’m not clear about the technical differences yet)


Resources - libraries

Lie Group libraries

  • GTSAM (https://github.com/borglab/gtsam/ )
    • All their data structures are defined as manifolds, hence the operators (Exp, Log, Ad, etc) are defined. Right-hand convention
  • Pinocchio (https://github.com/stack-of-tasks/pinocchio )
    • Implements Lie Groups structures for their computations. Right-hand convention
  • Lie groups in Python (https://github.com/utiasSTARS/liegroups )
    • Implementations for NumPy and PyTorch. Left-hand convention
