1 of 25

Automated aircraft inspection through image-to-CAD model registration

Bharath Somayajula

Niviru Wijayaratne


Sponsor: Near Earth Autonomy

2 of 25

The Team

  • Students
    • Bharath Somayajula, Graduate Student, CMU
    • Niviru Wijayaratne, Graduate Student, CMU
  • Industry Advisors
    • Dr. Sanjiv Singh, CEO, Near Earth Autonomy
    • Dr. Dennis Strelow, Scientist, Near Earth Autonomy
    • Dr. Marcel Bergerman, COO, Near Earth Autonomy
  • CMU Advisor
    • Dr. Ioannis Gkioulekas, Assistant Professor, CMU


3 of 25

Motivation

  • The aircraft maintenance market was valued at $87.01 billion in 2021
  • Aircraft inspection is tedious and often inefficient
  • Near Earth Autonomy drones are used in Boeing’s remote inspection programs
  • The Boeing C-17 is used by the air forces of the US, UK, Australia, and India


4 of 25

Problem Statement

Build an automated system for aircraft inspection through image-to-CAD model registration


5 of 25

Problem Statement: Pose Estimation


Fig: Pose estimation takes RGB image(s) and the untextured CAD model in its canonical pose, and outputs the untextured CAD model in the correct pose

6 of 25

Problem Statement: Texture Mapping


Fig: Texture mapping takes RGB image(s) and the untextured CAD model in the correct pose, and outputs the textured CAD model in the correct pose

7 of 25

Proposed Solution: Original


Fig: Original pipeline: image capture (narrow FOV and wide FOV cameras), keypoint prediction network, and post-processing

8 of 25

Proposed Solution: Updated


Fig: Updated pipeline: post-processing produces the pose [q, C]

9 of 25

Proposed Solution: Approach Comparison

  • Keypoint-Based Approach: predict keypoints, then recover the pose [q, C] in post-processing
  • Direct Regression: predict the pose [q, C] directly from the image

10 of 25

Dataset: Choices

  • Real data is difficult to acquire
  • Synthetic Data
    • C-17 Model
    • Background options


Table: Background options compared on high fidelity, low cost, and scalability; option shown: no background

11 of 25

Dataset: Choices

  • Real data is difficult to acquire
  • Synthetic Data
    • C-17 Model
    • Background options


Table: Background options compared on high fidelity, low cost, and scalability; options shown: no background, 3D backgrounds

12 of 25

Dataset: Choices

  • Real data is difficult to acquire
  • Synthetic Data
    • C-17 Model
    • Background options


Table: Background options (no background, 3D backgrounds, random 2D backgrounds) compared on high fidelity, low cost, and scalability
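As a concrete illustration of the random 2D background option, here is a minimal Python sketch of compositing a rendered aircraft onto a random background image using the rendered foreground mask. The function and argument names are hypothetical and the resolution is taken from the training setup; this is a sketch, not the project's actual implementation.

import numpy as np
from PIL import Image

def composite_with_background(render_rgb, mask, background_path, size=(512, 288)):
    """Paste the rendered aircraft onto a random 2D background image."""
    # Load and resize the background to the render resolution (width, height).
    background = Image.open(background_path).convert("RGB").resize(size)
    background = np.asarray(background, dtype=np.uint8)

    # mask is 1 where the aircraft is visible, 0 elsewhere.
    mask = mask.astype(bool)[..., None]

    # Keep rendered pixels on the aircraft, background pixels everywhere else.
    return np.where(mask, render_rgb, background)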

13 of 25

Dataset: Pipeline


Fig: Data generation pipeline: input CAD model, sample a point on the mesh, sample a camera position, generate the extrinsic matrix, render RGB, depth, and mask, and compose with a random background
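A minimal Python sketch of the camera-sampling step, assuming the camera is placed at a random distance and direction from a sampled look-at point on the mesh and that the extrinsic is a standard world-to-camera look-at matrix. The sampling ranges, the use of mesh vertices as look-at points, and the up vector are illustrative assumptions, not the project's actual settings.

import numpy as np

def look_at_extrinsic(camera_pos, target, up=np.array([0.0, 0.0, 1.0])):
    """Build a 4x4 world-to-camera extrinsic matrix that looks at `target`."""
    forward = target - camera_pos
    forward /= np.linalg.norm(forward)
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)

    # Rotation rows are the camera axes expressed in world coordinates.
    R = np.stack([right, true_up, -forward])
    t = -R @ camera_pos

    extrinsic = np.eye(4)
    extrinsic[:3, :3] = R
    extrinsic[:3, 3] = t
    return extrinsic

def sample_camera(mesh_vertices, min_dist=10.0, max_dist=40.0):
    """Sample a look-at point on the mesh and a camera position around it."""
    target = mesh_vertices[np.random.randint(len(mesh_vertices))]
    direction = np.random.randn(3)
    direction /= np.linalg.norm(direction)
    distance = np.random.uniform(min_dist, max_dist)
    camera_pos = target + distance * direction
    return camera_pos, look_at_extrinsic(camera_pos, target)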

14 of 25

Dataset: Background Image Examples


Fig: Examples of background images used during training

15 of 25

Dataset: Examples


Fig: Sample images produced by the data loader

16 of 25

Model: Architecture


Fig: Model architecture [1]

17 of 25

Quaternions

    • Rotation Matrices
      • Hard to enforce the orthogonality and det(R) = 1 constraints during regression
    • Euler Angles
      • Gimbal lock
    • Axis-Angle
      • Ambiguous axis of rotation when the angle is 0
    • Quaternions
      • Represent a rotation as a 4-dimensional unit vector (see the sketch below)
      • Avoid the issues of the other representations


Fig: Gimbal lock
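A minimal PyTorch sketch of how a raw 4-vector predicted by a network can be normalized into a unit quaternion and compared against a ground-truth quaternion. This is the standard formulation for quaternion regression, not necessarily the exact code used in this project.

import torch

def normalize_quaternion(q_raw, eps=1e-8):
    """Project a raw 4-vector onto the unit sphere so it is a valid rotation."""
    return q_raw / (q_raw.norm(dim=-1, keepdim=True) + eps)

def quaternion_angle(q_pred, q_gt):
    """Angle (radians) of the rotation between two unit quaternions.

    q and -q encode the same rotation, so the absolute value of the dot
    product is taken before the arccos.
    """
    dot = (q_pred * q_gt).sum(dim=-1).abs().clamp(max=1.0)
    return 2.0 * torch.acos(dot)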

18 of 25

Model: Training Details


Parameter        Value
Architecture     ResNet-34 backbone with a new FC layer
Learning Rate    0.001
Batch Size       16
Epochs           50
Resolution       288 x 512 (9:16)
Loss Function    L1 Loss

Table: Hyper-parameters used to train pose estimation network
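A minimal PyTorch sketch of what a ResNet-34 backbone with a new FC layer regressing [q, C] could look like, using the L1 loss, learning rate, batch size, and resolution from the table. The 7-dimensional output split (4 for the quaternion, 3 for the position), the Adam optimizer, and the dummy batch are assumptions for illustration only.

import torch
import torch.nn as nn
from torchvision import models

class PoseRegressor(nn.Module):
    """ResNet-34 backbone whose final FC layer outputs [q, C] (4 + 3 values)."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet34(weights=None)  # no pretrained weights (torchvision >= 0.13 API)
        backbone.fc = nn.Linear(backbone.fc.in_features, 7)
        self.backbone = backbone

    def forward(self, images):
        out = self.backbone(images)                       # (B, 7)
        q = nn.functional.normalize(out[:, :4], dim=-1)   # unit quaternion
        C = out[:, 4:]                                    # camera position
        return q, C

model = PoseRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.L1Loss()

# One illustrative training step on a dummy batch at 288 x 512 resolution.
images = torch.randn(16, 3, 288, 512)
q_gt = nn.functional.normalize(torch.randn(16, 4), dim=-1)
C_gt = torch.randn(16, 3)

optimizer.zero_grad()
q_pred, C_pred = model(images)
loss = criterion(q_pred, q_gt) + criterion(C_pred, C_gt)
loss.backward()
optimizer.step()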

19 of 25

Results: Qualitative Evaluation

19

Fig: Top row shows the input images to the pose prediction network; bottom row shows RGB renderings of the 3D model from the predicted poses

20 of 25

Results: Quantitative Evaluation


Metric           Value
Position Error   1.44 meters
Angular Error    5.38 degrees

Table: Evaluation metrics on held-out data

  • Position Error: Average distance between predicted and true camera positions

  • Angular Error: Average angle of the relative rotation between the predicted and ground-truth orientations
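A minimal sketch of how these two metrics might be computed from predicted and ground-truth [q, C] pairs, assuming unit quaternions and camera positions in meters; the function names are illustrative.

import numpy as np

def position_error(C_pred, C_gt):
    """Mean Euclidean distance (meters) between predicted and true camera positions."""
    return np.linalg.norm(C_pred - C_gt, axis=-1).mean()

def angular_error_deg(q_pred, q_gt):
    """Mean angle (degrees) of the relative rotation between unit quaternions."""
    dot = np.abs(np.sum(q_pred * q_gt, axis=-1)).clip(max=1.0)
    return np.degrees(2.0 * np.arccos(dot)).mean()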

21 of 25

Results: Inference


Metric                 Value
Number of Parameters   21.3 million
Inference Time         2.7 ms (at full precision on an RTX 3090 Ti)
FLOPs                  38.4 billion

Table: Inference metrics for pose estimation network
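A minimal sketch of how full-precision GPU latency is typically measured (warm-up iterations followed by synchronized timing). It does not reproduce the 2.7 ms figure; it only illustrates the methodology, and the function name, iteration counts, and input shape are assumptions.

import time
import torch

def measure_latency(model, input_shape=(1, 3, 288, 512), warmup=20, iters=100):
    """Average forward-pass latency in milliseconds on the current GPU."""
    device = torch.device("cuda")
    model = model.to(device).eval()
    x = torch.randn(*input_shape, device=device)

    with torch.no_grad():
        for _ in range(warmup):           # warm-up so timings exclude startup costs
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()          # wait for all queued kernels to finish
    return (time.perf_counter() - start) / iters * 1000.0

# e.g., on a CUDA machine: latency_ms = measure_latency(PoseRegressor())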

22 of 25

Sim-to-Real Gap

  • No real data for training or testing due to ITAR restrictions
  • Potential Solutions:
    1. Better rendering pipelines that capture specular reflections
        • Backgrounds are difficult to scale
    2. Using NeRF on toy aircraft models to generate real data
        • Non-representative backgrounds
    3. Annotating real images available on the internet
        • Covariate shift


23 of 25

Timeline and Future Work


October 1st - October 22nd: Improve data generation pipeline (Bharath & Niv)
October 23rd - October 31st: Experiment with loss functions (Bharath)
November 1st - November 15th: Iterative pose refinement [2] (Bharath & Niv)
November 15th - November 30th: Texture mapping (Niv)

24 of 25

References

  1. Xiang, Y., Schmidt, T., Narayanan, V., and Fox, D. "PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes." arXiv preprint arXiv:1711.00199, 2017.
  2. Trabelsi, A., et al. "A Pose Proposal and Refinement Network for Better 6D Object Pose Estimation." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021.
  3. Code: https://github.com/thebharathsk/16621


25 of 25


Thank you!