1 of 14

Automatic Human Body Part Segmentation

Bachelor Thesis

Author: Jakub Toma

Supervisor: Mgr. Dana Škorvánková

2022

2 of 14

Aim of the Thesis

  • Examine the human body part segmentation task
  • Preprocess input data
  • Implement existing architectures
  • Train models
  • Evaluate results

[Figure: input data → neural network → output segmentation]

3 of 14

Motivation

  • Still a challenge for researchers (varying human poses, viewpoints, body shapes, lighting, clothing, environment, etc.)
  • Analytical approaches → Machine learning
  • Applications in various fields
    • Robotics
    • Pose estimation
    • Physiotherapy
    • Gaming

4 of 14

UBC3V Dataset

  • Purely synthetic data
  • Depth maps and point clouds
  • 3 sub-datasets

https://arxiv.org/abs/1605.08068

5 of 14

Data Preprocessing

  • Depth map resizing
  • Point cloud down-sampling
  • Min-max normalization
  • One-hot encoding

[Figure: original point cloud → (down-sampling) → down-sampled point cloud → (annotation) → annotated down-sampled point cloud]
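The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the thesis code: the function names are hypothetical, and uniform random sampling is only an assumption, since the slides do not say which down-sampling method was used.

```python
import random


def min_max_normalize(values):
    """Rescale a flat list of values (e.g. depths) linearly to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def downsample(points, labels, n_points, seed=0):
    """Keep n_points random points together with their labels.

    Assumption: uniform random sampling without replacement.
    random.sample raises ValueError if fewer than n_points are available.
    """
    rng = random.Random(seed)
    indices = rng.sample(range(len(points)), n_points)
    return [points[i] for i in indices], [labels[i] for i in indices]


def one_hot(labels, n_classes):
    """Turn integer class labels into one-hot vectors."""
    return [[1 if c == label else 0 for c in range(n_classes)]
            for label in labels]


# Toy data: 10 points on a line, each labeled with its own index.
points = [(float(i), 0.0, 0.0) for i in range(10)]
labels = list(range(10))
sampled_points, sampled_labels = downsample(points, labels, n_points=4)
encoded = one_hot(sampled_labels, n_classes=10)
```

Sampling indices once and applying them to both points and labels keeps each point paired with its annotation.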

6 of 14

Adapted AlexNet

  • AlexNet
  • Input data shape 224x224x1
  • 46 output classes
  • Gradual (stepwise) deconvolution
  • + Dropout (overfitting reduction)

Layer      Parameters                                            Activation
conv_1     Convolution (k=11, s=4, o=32) + Batch Normalization   ReLU
pool_1     MaxPooling (pool_size=3, s=2)                         -
conv_2     Convolution (k=5, s=1, o=64) + Batch Normalization    ReLU
pool_2     MaxPooling (pool_size=3, s=2)                         -
conv_3     Convolution (k=3, s=1, o=64) + Batch Normalization    ReLU
conv_4     Convolution (k=3, s=1, o=128) + Batch Normalization   ReLU
conv_5     Convolution (k=3, s=1, o=128) + Batch Normalization   ReLU
pool_3     MaxPooling (pool_size=3, s=2)                         -
conv_6     Convolution (k=6, s=1, o=256) + Dropout (0.2)         ReLU
conv_7     Convolution (k=1, s=1, o=256) + Dropout (0.2)         ReLU
conv_8     Convolution (k=1, s=1, o=128)                         ReLU
upfeat_1   Deconvolution (k=4, s=2, o=128)                       -
upfeat_1   Deconvolution (k=2, s=2, o=64)                        -
upfeat_1   Deconvolution (k=2, s=2, o=64)                        -
upfeat_1   Deconvolution (k=4, s=4, o=32) + Cropping             -
score      Convolution (k=1, s=1, o=46)                          Softmax

Table: Adapted AlexNet architecture (k = kernel size, s = stride, o = number of output channels)
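Two properties of the layer table can be checked with simple arithmetic: the total down-sampling and up-sampling factors, and the number of weights per layer. The sketch below is a hedged illustration, not the thesis implementation. It assumes a purely sequential network, one bias per output channel, a single-channel input, and it ignores batch-normalization parameters; the helper names are hypothetical.

```python
# Layer table from the slide as (name, kind, kernel, stride, out_channels).
# Layer names are repeated exactly as they appear on the slide.
LAYERS = [
    ("conv_1", "conv", 11, 4, 32),
    ("pool_1", "pool", 3, 2, None),
    ("conv_2", "conv", 5, 1, 64),
    ("pool_2", "pool", 3, 2, None),
    ("conv_3", "conv", 3, 1, 64),
    ("conv_4", "conv", 3, 1, 128),
    ("conv_5", "conv", 3, 1, 128),
    ("pool_3", "pool", 3, 2, None),
    ("conv_6", "conv", 6, 1, 256),
    ("conv_7", "conv", 1, 1, 256),
    ("conv_8", "conv", 1, 1, 128),
    ("upfeat_1", "deconv", 4, 2, 128),
    ("upfeat_1", "deconv", 2, 2, 64),
    ("upfeat_1", "deconv", 2, 2, 64),
    ("upfeat_1", "deconv", 4, 4, 32),
    ("score", "conv", 1, 1, 46),
]


def total_stride(layers, kinds):
    """Multiply the strides of all layers whose kind is in `kinds`."""
    factor = 1
    for _name, kind, _kernel, stride, _out in layers:
        if kind in kinds:
            factor *= stride
    return factor


def count_weights(layers, in_channels=1):
    """Weights per layer: k * k * c_in * c_out kernel values plus c_out biases.

    Assumptions: sequential layers, biases everywhere, no batch-norm parameters.
    Pooling layers have no weights and keep the channel count.
    """
    counts = []
    channels = in_channels
    for name, kind, kernel, _stride, out in layers:
        if kind == "pool":
            continue
        counts.append((name, kernel * kernel * channels * out + out))
        channels = out
    return counts


down_factor = total_stride(LAYERS, {"conv", "pool"})
up_factor = total_stride(LAYERS, {"deconv"})
weights = count_weights(LAYERS)
total_weights = sum(n for _name, n in weights)
```

Under these assumptions the encoder reduces resolution by the same factor that the four deconvolutions restore, which is consistent with a final cropping step that only trims border pixels rather than rescaling.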

7 of 14

Training & Evaluation

  • Overall accuracy: 82.2%
  • Overall loss: 0.53
  • Performance metric: Mean IoU

[Figure: two pairs of ground truth and predicted segmentation]
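For reference, overall accuracy and mean IoU can be computed from flattened label sequences as sketched below. This is a generic illustration rather than the evaluation code used in the thesis. Skipping classes that are absent from both ground truth and prediction is an assumption, since the slides do not say how such classes were handled.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of pixels (or points) whose predicted label is correct."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_iou(y_true, y_pred, n_classes):
    """Average of per-class intersection over union.

    Assumption: classes that occur in neither sequence are skipped.
    """
    ious = []
    for c in range(n_classes):
        intersection = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        union = sum(t == c or p == c for t, p in zip(y_true, y_pred))
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class present in labels or predictions")
    return sum(ious) / len(ious)


# Toy example with three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
accuracy = overall_accuracy(y_true, y_pred)
miou = mean_iou(y_true, y_pred, n_classes=3)
```

In the toy example four of six labels match, while the per-class IoU values are 1/3, 2/3, and 1/2. Mean IoU penalizes errors in small classes more visibly than overall accuracy does.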

8 of 14

PointNet

  • Input point cloud: 2048 points
  • Background and edge points removed
  • 45 output classes
  • + 1 MLP
  • + Dropout

PointNet architecture
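The published PointNet design applies the same MLP to every point and aggregates the results with max pooling, which makes the global feature independent of point order. The toy sketch below illustrates only that idea. The single-layer "MLP", the random weights, and the function names are made up for illustration and do not reflect the model trained in the thesis.

```python
import random


def shared_mlp(point, weights, biases):
    """One linear layer followed by ReLU, applied identically to every point."""
    return [max(0.0, sum(w * x for w, x in zip(row, point)) + b)
            for row, b in zip(weights, biases)]


def global_feature(points, weights, biases):
    """Max-pool the per-point features over all points (symmetric function)."""
    per_point = [shared_mlp(p, weights, biases) for p in points]
    return [max(column) for column in zip(*per_point)]


rng = random.Random(0)
# Hypothetical layer mapping 3 coordinates to 4 features.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
biases = [rng.uniform(-1, 1) for _ in range(4)]

# Input size from the slide: 2048 points with (x, y, z) coordinates.
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(2048)]
shuffled = cloud[:]
rng.shuffle(shuffled)

feature_original = global_feature(cloud, weights, biases)
feature_shuffled = global_feature(shuffled, weights, biases)
```

Both feature vectors are identical because the maximum over a set does not depend on the order of its elements. For segmentation, PointNet concatenates this global feature with the per-point features before predicting a label for each point.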

9 of 14

Training & Evaluation

  • Overall accuracy: 85.4%
  • Overall loss: 0.44
  • Performance metric: Accuracy

[Figure: ground truth and predicted segmentation]

10 of 14

Summary

  • Various modifications of existing approaches
  • Depth maps and point clouds from the same sub-dataset (hard-pose)
  • Overall accuracy > 82%
  • Errors at the region edges

Model Architecture        Overall Accuracy
Shafaei16 – Net 3         80.6%
Ours – Adapted AlexNet    82.2%
Ours – PointNet           85.4%

11 of 14

Limitation

  • Region boundaries labeled as background

Future Work

  • Drop edge pixels from the input ground truth
  • Implement more sophisticated architectures
  • Test the trained model on real-life data

12 of 14

Opponent’s Questions

  • Q: In the solution design you mention modifications of existing solutions. How did you go about selecting the individual modifications?
  • A: I chose two architectures that were pioneers in the field and modified them:
    • refining the deconvolution process
    • changing the models' layers and parameters

  • Q: Did you also verify the network you trained on real data?
  • A: I worked purely with a synthetic dataset, on which I performed both training and evaluation. I describe the idea of testing on real data in the Future Work chapter.

13 of 14

Supervisor’s Questions

  • Q: Try to describe the ways in which you could ensure or improve the generalization of the methods to real data.
  • A: Regularization – applying methods that reduce overfitting / reduce the magnitude of the weights.
    • Dropout – deactivating randomly selected neurons.
    • Early stopping – stopping the training process when the validation error is at its minimum.
    • Data augmentation – extending the training set with various variations.
    • L1, L2 regularization – the sum of the absolute values (L1) / squares (L2) of the weights is added to the loss
      • Alpha parameter – determines the size of the penalty
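Two of the techniques named in the answer can be written down in a few lines. The sketch below is illustrative only: the function names, the patience-based stopping rule, and the sample numbers are assumptions, not values from the thesis.

```python
def penalized_loss(data_loss, weights, alpha, kind="l2"):
    """Add an L1 or L2 weight penalty, scaled by alpha, to the data loss."""
    if kind == "l1":
        penalty = sum(abs(w) for w in weights)
    elif kind == "l2":
        penalty = sum(w * w for w in weights)
    else:
        raise ValueError("kind must be 'l1' or 'l2'")
    return data_loss + alpha * penalty


def early_stopping(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Assumption: training stops once the validation loss has not improved
    for `patience` consecutive epochs; the best epoch's weights are kept.
    """
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical numbers, for illustration only.
l1 = penalized_loss(0.5, [1.0, -2.0], alpha=0.1, kind="l1")
l2 = penalized_loss(0.5, [1.0, -2.0], alpha=0.1, kind="l2")
stop_epoch, best_epoch = early_stopping(
    [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.7], patience=3)
```

A larger alpha pushes the optimizer toward smaller weights. The patience value trades off training time against the risk of stopping on a temporary plateau.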

14 of 14

Thank you for your attention
