Origami Sensei: Mixed reality AI-assistant for creative tasks using hands
Team
Advisors
MSCV students
Motivation
Challenges/Design decisions
Level: 0
Level: 100
Level: 30
Data collected and annotated
Data augmentation
Accuracy on test set (Internet videos): ~70%
https://apps.apple.com/us/app/how-to-make-origami/id472936700
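The augmentation step can be illustrated with generic label-preserving image transforms; the slides do not specify which transforms the project actually used, so everything below is an assumption:

```python
import numpy as np

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random combination of label-preserving transforms."""
    if rng.random() < 0.5:                  # random horizontal flip
        img = img[:, ::-1]
    k = int(rng.integers(0, 4))             # random 90-degree rotation
    img = np.rot90(img, k)
    gain = rng.uniform(0.8, 1.2)            # brightness jitter
    img = np.clip(img * gain, 0, 255)
    return img.astype(np.uint8)

# Expand each collected frame into several augmented training samples.
rng = np.random.default_rng(0)
frame = np.full((64, 64, 3), 128, dtype=np.uint8)
batch = [augment(frame, rng) for _ in range(8)]
```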
Current setup
Classify
Give feedback
Summary
Paper 1:
Existing pipeline
Our Pipeline
Improve step recognition
Inspiration on pipeline
Improve feedback
Paper 2:
Step recognition
Paper 3:
Hand tracking
Paper 1: Anomaly Detection of Folding Operations for Origami Instruction with Single Camera (IEICE 2020)
Anomaly Detection of Folding Operations for Origami Instruction with Single Camera. Hiroshi Shimanuki, Toyohide Watanabe, Koichi Asakura, Hideki Sato, Taketoshi Ushiama. IEICE 2020.
History of this group
Pipeline
Input
Output
Manual instruction model construction
Folding support for beginners based on state estimation of Origami. Toyohide Watanabe and Yasuhiro Kinoshita. TENCON 2012. https://ieeexplore.ieee.org/document/6412167
Predefined types of folds
Generate silhouette
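Paper 1 recognizes the current folding state by comparing the observed paper silhouette against silhouettes generated from the instruction model and taking the best match (silhouette IoU + argmax). A minimal sketch of that matching idea on binary masks; this illustrates the mechanism only, not the authors' code:

```python
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over union of two boolean silhouette masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def recognize_state(observed: np.ndarray, references: list) -> int:
    """Index of the reference silhouette with highest IoU (argmax)."""
    return int(np.argmax([iou(observed, r) for r in references]))

# Toy example: three candidate states; the observed mask is a slightly
# noisy triangle, so matching should pick reference 1.
square = np.zeros((8, 8), bool); square[2:6, 2:6] = True
tri = np.tril(np.ones((8, 8), bool))
obs = np.tril(np.ones((8, 8), bool)); obs[0, 0] = False
state = recognize_state(obs, [square, tri, ~tri])   # -> 1
```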
Segmentation
Position & rotation estimation
State recognition + binary SVM mistake detection
https://appliedmachinelearning.wordpress.com/tag/hyperplane-svm/
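The mistake detector is a binary SVM separating correct from incorrect folding operations. As an illustration, here is a minimal linear SVM trained by sub-gradient descent on the hinge loss; the two-dimensional features and toy data are invented for the sketch (the paper derives its features from the silhouette comparison):

```python
import numpy as np

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=200):
    """Minimal linear SVM via sub-gradient descent on the hinge loss.

    X: (n, d) feature matrix; y: labels in {-1, +1}. Returns (w, b)."""
    rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            if y[i] * (X[i] @ w + b) < 1:       # point violates the margin
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                               # only regularize
                w -= lr * lam * w
    return w, b

# Toy data: "correct fold" features cluster high, "mistake" features low.
X = np.array([[0.9, 0.8], [0.85, 0.9], [0.2, 0.1], [0.15, 0.3]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
```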
Provide Instruction
State recognition performance (good)
SVM Performance (not very good)
Limitations
Pipeline Comparison
| | Fixed set-up (e.g. paper/table color) | Automatic step recognition | Uses neural networks | Origami step recognition method | Manual instruction construction | How to give instructions? |
| Shimanuki et al. (2020) | Yes | Yes | No | Silhouette IoU + argmax (limits usable views and steps) | Yes (built a software tool) | Visual overlay |
| Our project | No | Yes | Yes | Multi-class classification network (see paper 2) | Yes | Visual overlay (via projector later) + hand guidance (see paper 3) |
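The multi-class step recognition in our pipeline can be illustrated with a linear softmax classifier; in the actual system the features would come from a CNN backbone, so the dimensions and data below are toy stand-ins:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class StepClassifier:
    """Multi-class (softmax) classifier over origami steps."""
    def __init__(self, dim, n_steps, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((dim, n_steps))
        self.lr = lr

    def fit(self, X, y, epochs=300):
        Y = np.eye(self.W.shape[1])[y]          # one-hot targets
        for _ in range(epochs):
            P = softmax(X @ self.W)
            self.W -= self.lr * X.T @ (P - Y) / len(X)  # CE gradient step

    def predict(self, X):
        return np.argmax(X @ self.W, axis=1)

# Toy features for three folding steps.
X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0.1, 0]], dtype=float)
y = np.array([0, 1, 2, 0])
clf = StepClassifier(dim=3, n_steps=3)
clf.fit(X, y)
```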
Paper 2: Temporal Action Segmentation from Timestamp Supervision (CVPR 2021)
Temporal Action Segmentation from Timestamp Supervision. Zhe Li, Yazan Abu Farha, Juergen Gall. CVPR 2021.
Definition
Motivation
“annotators need 6 times longer to annotate the start and end frame compared to annotating a single timestamp”
Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, and Zheng Shou. SF-Net: Single-frame supervision for temporal action localization. In European Conference on Computer Vision (ECCV), 2020
Novelty
Method
Input
Video
Timestamp Annotation
Loss Definition
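With timestamp supervision, only one frame per action segment carries a label. The core term is a cross-entropy loss evaluated only at those annotated frames; the paper additionally generates pseudo-labels for the unlabeled frames and adds a confidence loss, both of which this sketch omits:

```python
import numpy as np

def timestamp_ce_loss(log_probs: np.ndarray, stamps: dict) -> float:
    """Cross-entropy over annotated frames only.

    log_probs: (T, C) per-frame log class probabilities.
    stamps: {frame_index: class_label}, one entry per action segment."""
    return -sum(log_probs[t, c] for t, c in stamps.items()) / len(stamps)

# 5 frames, 2 classes; only frames 1 and 3 are annotated.
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.3, 0.7], [0.2, 0.8]])
loss = timestamp_ce_loss(np.log(probs), {1: 0, 3: 1})
# loss = -(log 0.8 + log 0.7) / 2
```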
Performance: comparable to fully supervised models
Performance: agnostic to the segmentation model
Relation to our project:
Romero, Javier, Dimitrios Tzionas, and Michael J. Black. 2017. “Embodied Hands.” ACM Transactions on Graphics 36 (6): 1–17.
MANO: Parametric model for hands
Artist-defined hand mesh with joints and blend weights
Disentangle pose and shape space
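The disentangling can be sketched as follows: vertices are the artist-defined template plus linear shape offsets (driven by shape coefficients β) and pose-corrective offsets (driven by pose features), after which linear blend skinning poses the mesh. All sizes below are toy values (the real MANO mesh has 778 vertices), and the skinning step is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)
V, S, P = 12, 4, 6                            # toy vertex/shape/pose sizes

template = rng.standard_normal((V, 3))        # artist-defined rest mesh
shape_dirs = rng.standard_normal((V, 3, S))   # shape blendshape basis
pose_dirs = rng.standard_normal((V, 3, P))    # pose-corrective basis

def mano_rest_vertices(beta, theta_feat):
    """Template + shape offsets + pose-dependent corrective offsets.

    beta: (S,) shape coefficients; theta_feat: (P,) pose features.
    Linear blend skinning would be applied afterwards (omitted here)."""
    return template + shape_dirs @ beta + pose_dirs @ theta_feat

# With zero parameters the model reduces to the template mesh.
v = mano_rest_vertices(np.zeros(S), np.zeros(P))
```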
Paper 3: RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video (SIGGRAPH Asia 2020)
Wang, Jiayi, Franziska Mueller, Florian Bernard, Suzanne Sorli, Oleksandr Sotnychenko, Neng Qian, Miguel A. Otaduy, Dan Casas, and Christian Theobalt. 'RGB2Hands: Real-Time Tracking of 3D Hand Interactions from Monocular RGB Video'. ACM Transactions on Graphics (TOG) 39, no. 6 (December 2020).
Motivation
Tracking two interacting hands in real-time using monocular RGB video
Key-point detection
Dense 2D fitting
Inter-hand and Intra-hand distance
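These terms are combined into a single fitting energy minimized over the MANO pose and shape parameters. A hedged sketch of how such terms could be summed; the weights, residual definitions, and function signature are illustrative, not the paper's actual formulation:

```python
import numpy as np

def fitting_energy(pred_kp, obs_kp, pred_mask, obs_mask, hand_dists,
                   w=(1.0, 0.5, 0.1)):
    """Weighted sum of fitting terms, in the spirit of the paper's energy.

    - keypoint term: squared 2D keypoint reprojection error
    - dense term: per-pixel silhouette disagreement
    - interaction term: penalize inter-hand penetration (negative distances)"""
    e_kp = np.sum((pred_kp - obs_kp) ** 2)
    e_dense = np.mean(pred_mask != obs_mask)
    e_inter = np.sum(np.clip(-hand_dists, 0, None) ** 2)
    return w[0] * e_kp + w[1] * e_dense + w[2] * e_inter

# Perfect fit: keypoints and masks agree, no penetration -> energy 0.
kp = np.zeros((21, 2))
mask = np.ones((4, 4), bool)
e = fitting_energy(kp, kp, mask, mask, np.array([0.01, 0.02]))   # -> 0.0
```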
RGB Image
Two-hand tracking
MANO pose and shape parameters
Method
Image fitting loss
Results
Q & A
Thank you! Happy spring break!