Martial Arts Meets Machine Learning: Recognizing Judo Throws with MMAction2
June 2nd, 2023
Habeeb Shopeju
A Judo Primer
Teddy Riner during a throw.
The Gentle Way
Mini-Glossary
Mini-Glossary Cont’d
Throw Examples
Uchi Mata
O Soto Gari
Stair-step Approach to a Throw
1. Grip (Kumi Kata)
2. Off-Balancing (Kuzushi)
3. Setting up (Tsukuri)
4. Execution (Kake)
More Throw Examples
Tomoe nage
Seoi nage
Action Recognition
Object Detection
Object Detection for Action Recognition
A: The door is being opened.
B: The door is being closed.
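The two captions describe the same door, so a single frame cannot tell opening from closing; only the order of frames can. A toy sketch of that point, assuming a hypothetical per-frame door angle (in degrees) has already been estimated by some detector:

```python
def door_action(angles):
    """Label a clip from a sequence of hypothetical per-frame door angles."""
    if len(angles) < 2:
        return "unknown"  # a single frame carries no motion information
    change = angles[-1] - angles[0]
    if change > 0:
        return "opening"
    if change < 0:
        return "closing"
    return "static"


clip = [5, 20, 45, 70]          # angle grows over time
print(door_action(clip))        # opening
print(door_action(clip[::-1]))  # same frames in reverse order: closing
print(door_action(clip[:1]))    # one frame only: unknown
```

The frames in both clips are identical; only their ordering changes the label, which is the information an action recognition model has to capture.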
Action Recognition
Models: C3D
Learning Spatiotemporal Features with 3D Convolutional Networks (ICCV’2015)
Key Ideas
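C3D convolves over time as well as space, so the feature maps keep a temporal axis. A small shape-arithmetic sketch, assuming the setup described in the C3D paper (16-frame 112x112 clips, 3x3x3 kernels with padding 1, a first pooling layer of 1x2x2 and a second of 2x2x2); the helper names are illustrative:

```python
def conv3d_out(size, kernel=3, stride=1, padding=1):
    """Output (frames, height, width) of a 3D convolution applied to `size`."""
    return tuple((s + 2 * padding - kernel) // stride + 1 for s in size)


def pool3d_out(size, kernel):
    """Output size of non-overlapping 3D pooling with the given kernel."""
    return tuple(s // k for s, k in zip(size, kernel))


clip = (16, 112, 112)             # (frames, height, width)
x = conv3d_out(clip)              # 3x3x3 conv with padding 1 keeps the size
x = pool3d_out(x, (1, 2, 2))      # first pool leaves the temporal axis alone
print(x)                          # (16, 56, 56)
x = pool3d_out(conv3d_out(x), (2, 2, 2))
print(x)                          # (8, 28, 28)
```

A 2D network would collapse the frame axis immediately; here it shrinks gradually, so motion information survives into deeper layers.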
Models: TSN
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition (ECCV’2016)
Key Ideas
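TSN splits a video into a few equal segments, samples one snippet from each, and averages the per-segment class scores. A minimal sketch of that sampling and consensus step in plain Python; the function names are illustrative and not MMAction2 APIs:

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of `num_segments` equal-length segments.

    Assumes num_frames >= num_segments.
    """
    rng = rng or random.Random()
    seg_len = num_frames / num_segments
    indices = []
    for k in range(num_segments):
        start = int(k * seg_len)
        end = max(start + 1, int((k + 1) * seg_len))
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(segment_scores):
    """Average per-segment class scores into one video-level score list."""
    num_classes = len(segment_scores[0])
    return [
        sum(scores[c] for scores in segment_scores) / len(segment_scores)
        for c in range(num_classes)
    ]


print(sample_segment_indices(300, 3, random.Random(0)))
print(average_consensus([[1.0, 0.0], [0.0, 1.0]]))  # [0.5, 0.5]
```

Because only one snippet per segment is processed, the cost stays roughly constant however long the video is.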
Models: TimeSformer
Is Space-Time Attention All You Need for Video Understanding? (ICML’2021)
Key Ideas
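TimeSformer treats a clip as a sequence of patches and, in its divided space-time variant, attends over time and space separately. A small sketch of the per-patch comparison counts given in the paper (N·F + 1 for joint attention versus N + F + 2 for divided attention), assuming 8 frames of 224x224 with 16x16 patches:

```python
def comparisons_per_patch(image_size, patch_size, num_frames):
    """Return (patches per frame, joint comparisons, divided comparisons)."""
    n = (image_size // patch_size) ** 2
    joint = n * num_frames + 1       # every patch in every frame, plus the class token
    divided = n + num_frames + 2     # temporal pass plus spatial pass
    return n, joint, divided


print(comparisons_per_patch(224, 16, 8))  # (196, 1569, 206)
```

The gap widens quickly as resolution or clip length grows, which is the motivation for dividing the attention.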
MMAction2
OpenMMLab Ecosystem
Config Fundamentals: Key-value pairs
# random_config.py
dataset_type = 'VideoDataset'
data_root = 'data/kinetics400/videos_train'
file_client_args = dict(io_backend='disk')

train_pipeline = [
    dict(type='DecordInit', **file_client_args),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(type='PackActionInputs')
]

train_dataloader = dict(
    batch_size=32,
    num_workers=8,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        data_prefix=dict(video=data_root),
        pipeline=train_pipeline
    )
)
Config Fundamentals: Dot notation
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('random_config.py')
>>> cfg.dataset_type
'VideoDataset'
>>> cfg.train_dataloader.dataset.data_prefix
{'video': 'data/kinetics400/videos_train'}
>>> cfg.train_pipeline[2].scale
(224, 224)
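Dot notation works because nested dictionaries are wrapped in a dict type that also exposes keys as attributes. A toy stand-in for that behaviour in plain Python; `AttrDict` and `to_attr` are illustrative names and not part of MMEngine:

```python
class AttrDict(dict):
    """A dict whose keys can also be read as attributes (toy example)."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


def to_attr(obj):
    """Recursively wrap dicts (including dicts inside lists) in AttrDict."""
    if isinstance(obj, dict):
        return AttrDict({k: to_attr(v) for k, v in obj.items()})
    if isinstance(obj, list):
        return [to_attr(v) for v in obj]
    return obj


cfg = to_attr(dict(
    dataset_type='VideoDataset',
    train_pipeline=[
        dict(type='DecordInit'),
        dict(type='DecordDecode'),
        dict(type='Resize', scale=(224, 224), keep_ratio=False),
    ],
))
print(cfg.dataset_type)             # VideoDataset
print(cfg.train_pipeline[2].scale)  # (224, 224)
```

Key access such as `cfg['dataset_type']` still works, since the wrapper remains an ordinary dict underneath.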
Config Fundamentals: Modular and inheritance design
# small_random_config.py
_base_ = ["random_config.py"]
dataset_type = "PoseDataset"
train_pipeline = _base_.train_pipeline
train_pipeline[2]["scale"] = (448, 448)
# Python IDLE
>>> from mmengine.config import Config
>>> cfg = Config.fromfile('small_random_config.py')
>>> cfg.dataset_type
'PoseDataset'
>>> cfg.train_pipeline[2]
{'type': 'Resize', 'scale': (448, 448), 'keep_ratio': False}
>>> cfg.file_client_args
{'io_backend': 'disk'}
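Inheritance can be pictured as a recursive dictionary merge: keys in the child override the base, nested dicts are merged key by key, and everything else is inherited unchanged. A simplified sketch of that idea; `merge_config` is an illustrative helper and not MMEngine's implementation:

```python
import copy


def merge_config(base, child):
    """Return a new dict where `child` overrides `base`, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale
            merged[key] = copy.deepcopy(value)
    return merged


base = dict(
    dataset_type='VideoDataset',
    file_client_args=dict(io_backend='disk'),
    train_dataloader=dict(batch_size=32, num_workers=8),
)
child = dict(
    dataset_type='PoseDataset',
    train_dataloader=dict(batch_size=8),
)
cfg = merge_config(base, child)
print(cfg['dataset_type'])      # PoseDataset
print(cfg['train_dataloader'])  # {'batch_size': 8, 'num_workers': 8}
print(cfg['file_client_args'])  # {'io_backend': 'disk'}
```

Lists such as `train_pipeline` are replaced as a whole in this sketch, which is why the slide's config edits a single pipeline step through `_base_.train_pipeline` instead of redefining the list.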
Data Pipeline: Config
train_pipeline = [
    dict(type='DecordInit', **file_client_args),
    dict(type='UniformSample', clip_len=8, num_clips=1),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='PytorchVideoWrapper', op='RandAugment', magnitude=7, num_layers=4),
    dict(type='RandomResizedCrop'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='PackActionInputs')
]
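Each pipeline entry is a transform that receives a results dictionary, adds or changes keys, and passes it to the next step. A toy sketch of that chaining in plain Python; the transforms below are simplified stand-ins and not the MMAction2 classes:

```python
from functools import partial


def sample_frames(results, clip_len=8):
    """Choose `clip_len` evenly spaced frame indices (toy version)."""
    total = results['total_frames']
    step = max(total // clip_len, 1)
    results['frame_inds'] = [min(i * step, total - 1) for i in range(clip_len)]
    return results


def decode(results):
    """Pretend to decode the sampled frames; real code would read pixels."""
    results['imgs'] = [f"frame_{i}" for i in results['frame_inds']]
    return results


def compose(transforms):
    """Chain transforms so the output dict of one feeds the next."""
    def run(results):
        for transform in transforms:
            results = transform(results)
        return results
    return run


pipeline = compose([partial(sample_frames, clip_len=8), decode])
out = pipeline({'filename': 'video.mp4', 'total_frames': 80})
print(out['frame_inds'])  # [0, 10, 20, 30, 40, 50, 60, 70]
print(len(out['imgs']))   # 8
```

Order matters: `decode` relies on the `frame_inds` key written by the sampling step, just as the decoding step in the config comes after the sampling step.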
Data Pipeline: APIs
Dataloader: Config
train_dataloader = dict(
    batch_size=32,
    num_workers=8,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        data_prefix=dict(video=data_root),
        pipeline=train_pipeline
    )
)
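For a video dataset, the annotation file is typically a text file with one `<video path> <label index>` pair per line, resolved against the `data_prefix`. A minimal parsing sketch, assuming that layout; the file names and labels below are made up for illustration:

```python
import posixpath


def parse_video_annotations(text, video_root):
    """Turn 'path label' lines into a list of sample dicts."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        filename, label = line.rsplit(' ', 1)
        samples.append(dict(
            filename=posixpath.join(video_root, filename),
            label=int(label),
        ))
    return samples


ann_text = "clip_a.mp4 0\nclip_b.mp4 1\n"  # hypothetical annotation content
samples = parse_video_annotations(ann_text, 'data/kinetics400/videos_train')
print(samples[0])
print(len(samples))  # 2
```

Splitting from the right keeps paths that contain spaces intact, since only the last field is the label.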
Model: Config
model = dict(
    type='Recognizer2D',
    backbone=dict(
        type='ResNet',
        pretrained='https://download.pytorch.org/models/resnet50-11ad3fa6.pth',
        depth=50,
        norm_eval=False),
    cls_head=dict(
        type='TSNHead',
        num_classes=400,
        in_channels=2048,
        ...),
    ...)
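The `type` strings are resolved through a registry that maps names to classes, and the remaining keys become constructor arguments. A toy sketch of that pattern; the `Toy*` classes and the `build` helper are illustrative, and building nested configs inside `build` is a simplification of how the real library does it:

```python
REGISTRY = {}


def register(cls):
    """Class decorator that records a class under its own name."""
    REGISTRY[cls.__name__] = cls
    return cls


def build(cfg):
    """Instantiate the class named by cfg['type'], building nested configs first."""
    args = dict(cfg)
    cls = REGISTRY[args.pop('type')]
    for key, value in args.items():
        if isinstance(value, dict) and 'type' in value:
            args[key] = build(value)
    return cls(**args)


@register
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth
        # ResNet-50 and deeper end with 2048 channels; ResNet-18/34 end with 512
        self.out_channels = 2048 if depth >= 50 else 512


@register
class ToyHead:
    def __init__(self, num_classes, in_channels):
        self.num_classes = num_classes
        self.in_channels = in_channels


@register
class ToyRecognizer:
    def __init__(self, backbone, cls_head):
        # The head's input width must match the backbone's output width
        assert cls_head.in_channels == backbone.out_channels
        self.backbone = backbone
        self.cls_head = cls_head


model = build(dict(
    type='ToyRecognizer',
    backbone=dict(type='ToyBackbone', depth=50),
    cls_head=dict(type='ToyHead', num_classes=400, in_channels=2048),
))
print(type(model).__name__, model.cls_head.num_classes)  # ToyRecognizer 400
```

This is why `in_channels=2048` sits next to `depth=50` in the config: changing the backbone depth usually means changing the head's input width too.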
Model: APIs
Other APIs
Training & Inference
# Terminal
$ python mmaction2/tools/train.py random_config.py

# Python IDLE
>>> from mmaction.apis import init_recognizer, inference_recognizer
>>> config_file = 'random_config.py'
>>> checkpoint_file = 'best_checkpoint.pth'
>>> video_file = 'video.mp4'
>>> model = init_recognizer(config_file, checkpoint_file, device='cuda:0')  # or device='cpu'
>>> pred_result = inference_recognizer(model, video_file)
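How the scores are stored inside `pred_result` depends on the MMAction2 version, so the sketch below starts from a plain list of per-class scores, assuming they have already been extracted. The label names and score values are hypothetical, chosen to match the throws named earlier:

```python
def top_k(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


labels = ['uchi_mata', 'o_soto_gari', 'tomoe_nage', 'seoi_nage']  # hypothetical classes
scores = [0.10, 0.60, 0.05, 0.25]                                 # hypothetical scores
for label, score in top_k(scores, labels, k=2):
    print(f"{label}: {score:.2f}")
# o_soto_gari: 0.60
# seoi_nage: 0.25
```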
Training a Model on Kinetics-Tiny
Details
Dataset: Kinetics400-Tiny
Installation and Configuration Files
Getting the Data
Config File
Training a model
Explaining Dimension Mismatch
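A dimension mismatch typically shows up when a checkpoint trained on 400 classes is loaded into a model whose head has fewer outputs: the classifier weights have different shapes, while the backbone weights still match. A sketch of that comparison using shape tuples in place of real tensors; the parameter names and the 2-class head are assumptions for illustration:

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for differing parameters."""
    mismatched = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatched[name] = (ckpt_shape, model_shape)
    return mismatched


checkpoint = {                      # pretrained on 400 classes
    'backbone.conv1.weight': (64, 3, 7, 7),
    'cls_head.fc_cls.weight': (400, 2048),
    'cls_head.fc_cls.bias': (400,),
}
model = {                           # fine-tuning head with 2 classes (assumed)
    'backbone.conv1.weight': (64, 3, 7, 7),
    'cls_head.fc_cls.weight': (2, 2048),
    'cls_head.fc_cls.bias': (2,),
}
for name, (ckpt, new) in find_shape_mismatches(checkpoint, model).items():
    print(f"{name}: checkpoint {ckpt} vs model {new}")
```

Only the classification layer differs, so it is trained from scratch while the pretrained backbone weights are reused.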
Model Inference
Predicting Throw
Training a Model on Judo-Tiny; DIY
Details
Dataset: Judo-Tiny
Installation
Downloading Configuration and checkpoint files
Getting the data
Config File
Training a model
ImgAug Documentation
Loading the Model
Predicting throw
Thank You