Video segmentation…
Is the optical flow algorithm doing all the work?
… in a fully self-supervised manner
Outline
Method
Slot attention module[1]
Maps from a set of N input features to a set of K slots
[1] Object-Centric Learning with Slot Attention, F. Locatello, D. Weissenborn, T. Unterthiner, A. Mahendran, G. Heigold, J. Uszkoreit, A. Dosovitskiy, T. Kipf
Hard clustering
Soft clustering
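The contrast between hard and soft clustering can be sketched as follows. This is a minimal NumPy illustration, not the deck's implementation: the function names `hard_assign` and `soft_assign`, the squared Euclidean distance, and the `temperature` parameter are all assumptions chosen for the example.

```python
import numpy as np

def pairwise_sq_dist(features, centers):
    """Squared Euclidean distance between N features and K centers: (N, K)."""
    return ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)

def hard_assign(features, centers):
    """Hard clustering: each feature belongs to exactly one center (one-hot row)."""
    nearest = pairwise_sq_dist(features, centers).argmin(axis=1)
    return np.eye(centers.shape[0])[nearest]

def soft_assign(features, centers, temperature=1.0):
    """Soft clustering: each feature gets a probability over all centers."""
    logits = -pairwise_sq_dist(features, centers) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

features = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.5, 0.5]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
hard = hard_assign(features, centers)   # one-hot rows
soft = soft_assign(features, centers)   # rows are probabilities
```

The last feature lies exactly between the two centers: the hard assignment must still pick one of them, while the soft assignment splits it evenly.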
Slot attention iteration
One iteration of slot attention
“Cross Attention”
GRU
MLP
Slot update using “Cross Attention”
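The "cross attention" slot update can be sketched in NumPy as below. It follows the general recipe of [1] (queries from the slots, keys and values from the inputs, softmax over the slot axis, then a weighted mean over inputs), but it is a simplified sketch: layer normalisation is omitted, and the weight matrices `w_q`, `w_k`, `w_v` are random placeholders rather than learned parameters.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def slot_cross_attention(inputs, slots, w_q, w_k, w_v, eps=1e-8):
    """inputs: (N, D), slots: (K, D) -> updates (K, D), attn (N, K)."""
    q = slots @ w_q     # (K, D) queries come from the slots
    k = inputs @ w_k    # (N, D) keys come from the input features
    v = inputs @ w_v    # (N, D) values come from the input features
    logits = (k @ q.T) / np.sqrt(q.shape[-1])  # (N, K)
    # Softmax over the slot axis: slots compete for each input feature.
    attn = softmax(logits, axis=1)
    # Normalise over inputs so each slot takes a weighted mean of the values.
    weights = attn / (attn.sum(axis=0, keepdims=True) + eps)
    updates = weights.T @ v  # (K, D)
    return updates, attn

rng = np.random.default_rng(0)
N, K, D = 6, 2, 4
inputs = rng.normal(size=(N, D))
slots = rng.normal(size=(K, D))
w_q, w_k, w_v = (rng.normal(size=(D, D)) for _ in range(3))
updates, attn = slot_cross_attention(inputs, slots, w_q, w_k, w_v)
```

Taking the softmax over slots rather than over inputs is what makes this a soft clustering step: every input feature distributes one unit of attention across the K slots.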
Recurrent Unit for the output slot
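A minimal sketch of the recurrent slot update, assuming a GRU cell followed by a residual MLP as listed in the iteration overview. The gate layout shown is one common GRU variant; the parameter dictionary keys (`W_z`, `U_z`, ...) and the hidden width of the MLP are hypothetical names and sizes picked for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, p):
    """x: (K, D) attention updates, h: (K, D) previous slots -> new slots (K, D)."""
    z = sigmoid(x @ p["W_z"] + h @ p["U_z"] + p["b_z"])        # update gate
    r = sigmoid(x @ p["W_r"] + h @ p["U_r"] + p["b_r"])        # reset gate
    n = np.tanh(x @ p["W_n"] + (r * h) @ p["U_n"] + p["b_n"])  # candidate state
    return (1.0 - z) * n + z * h

def residual_mlp(h, w1, b1, w2, b2):
    """Two-layer ReLU MLP added back onto its input."""
    return h + np.maximum(h @ w1 + b1, 0.0) @ w2 + b2

rng = np.random.default_rng(0)
K, D = 2, 4
params = {f"{m}_{g}": rng.normal(scale=0.1, size=(D, D)) for m in ("W", "U") for g in "zrn"}
params.update({f"b_{g}": np.zeros(D) for g in "zrn"})
prev_slots = rng.normal(size=(K, D))
updates = rng.normal(size=(K, D))

new_slots = gru_cell(updates, prev_slots, params)
w1, b1 = rng.normal(scale=0.1, size=(D, 2 * D)), np.zeros(2 * D)
w2, b2 = rng.normal(scale=0.1, size=(2 * D, D)), np.zeros(D)
new_slots = residual_mlp(new_slots, w1, b1, w2, b2)
```

The previous slots act as the hidden state and the attention updates as the input, so each slot is refined across iterations rather than overwritten.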
Reconstruction Loss
Decoder: from each slot/layer i, it reconstructs:
Final reconstruction image
Broadcasting:
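The decoding steps named above can be sketched as follows, under several assumptions. Broadcasting is taken to mean tiling each slot vector over the spatial grid and appending normalised (y, x) coordinates. The final image is taken to be a mask-weighted sum of per-slot reconstructions, with masks normalised by a softmax over slots. The reconstruction loss is taken to be a mean squared error. The decoder network itself is not shown; random arrays stand in for its outputs, and all function names are hypothetical.

```python
import numpy as np

def spatial_broadcast(slot, height, width):
    """slot: (D,) -> (H, W, D + 2): slot tiled over the grid plus (y, x) coords."""
    tiled = np.broadcast_to(slot, (height, width, slot.shape[0]))
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    return np.concatenate([tiled, ys[..., None], xs[..., None]], axis=-1)

def composite(slot_images, mask_logits):
    """slot_images: (K, H, W, C), mask_logits: (K, H, W, 1) -> (H, W, C), masks."""
    m = mask_logits - mask_logits.max(axis=0, keepdims=True)
    e = np.exp(m)
    masks = e / e.sum(axis=0, keepdims=True)       # softmax over the K slots
    recon = (masks * slot_images).sum(axis=0)      # mask-weighted sum of layers
    return recon, masks

def reconstruction_loss(recon, target):
    return float(np.mean((recon - target) ** 2))

rng = np.random.default_rng(0)
K, H, W, D, C = 2, 4, 5, 3, 2
grid = spatial_broadcast(rng.normal(size=D), H, W)   # decoder input for one slot
slot_images = rng.normal(size=(K, H, W, C))          # stand-in decoder outputs
mask_logits = rng.normal(size=(K, H, W, 1))          # stand-in mask logits
recon, masks = composite(slot_images, mask_logits)
loss = reconstruction_loss(recon, rng.normal(size=(H, W, C)))
```

Because the masks sum to one at every pixel, each pixel of the final reconstruction is explained by a convex combination of the slot layers.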
Slot attention module results, multi iterations
Challenges
Challenges:
Idea:
New Pipeline
Reconstruction
From each slot/layer i, it reconstructs:
Additional losses
Encourage the masks to be binary
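One common way to push soft masks towards binary values is a pixel-wise entropy penalty; whether the deck uses exactly this form is an assumption. The sketch below assumes masks of shape (K, H, W) that sum to one over the slot axis, and the name `entropy_loss` is hypothetical.

```python
import numpy as np

def entropy_loss(masks, eps=1e-8):
    """Mean pixel-wise entropy of masks with shape (K, H, W), summing to 1 over K."""
    return float(np.mean(-(masks * np.log(masks + eps)).sum(axis=0)))

# Two slots on a 2x2 image: fully undecided masks versus fully binary masks.
uniform = np.full((2, 2, 2), 0.5)
binary = np.stack([np.array([[1.0, 0.0], [0.0, 1.0]]),
                   np.array([[0.0, 1.0], [1.0, 0.0]])])
loss_uniform = entropy_loss(uniform)   # close to log(2), the maximum for K = 2
loss_binary = entropy_loss(binary)     # close to 0
```

The penalty is largest when a pixel is split evenly between slots and vanishes when each pixel is assigned to a single slot.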
Encourage temporal consistency
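A temporal consistency term can be sketched as a distance between two mask predictions that should agree, for example masks predicted for the same frame from two different inputs. Which pairs are compared is not stated here, so that choice, the squared-error distance, and the name `consistency_loss` are assumptions. Because slots carry no fixed ordering, the sketch takes the minimum over slot permutations.

```python
import numpy as np
from itertools import permutations

def consistency_loss(masks_a, masks_b):
    """masks_*: (K, H, W). Minimum over slot permutations of the mean squared difference."""
    k = masks_a.shape[0]
    return min(float(np.mean((masks_a - masks_b[list(p)]) ** 2))
               for p in permutations(range(k)))

rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 4, 4))
masks_a = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
masks_swapped = masks_a[::-1]            # same segmentation, slots in the other order
loss_swapped = consistency_loss(masks_a, masks_swapped)
loss_different = consistency_loss(masks_a, np.full((2, 4, 4), 0.5))
```

Enumerating permutations is only practical for a small number of slots, which fits a foreground/background setting with very few layers.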
Final loss
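A plausible form of the final objective, assuming it is a weighted sum of the three terms named above. The symbols and the weights are hypothetical notation introduced here, not taken from the slides:

```latex
\mathcal{L}
  = \lambda_{\mathrm{rec}}\,\mathcal{L}_{\mathrm{rec}}
  + \lambda_{\mathrm{bin}}\,\mathcal{L}_{\mathrm{bin}}
  + \lambda_{\mathrm{temp}}\,\mathcal{L}_{\mathrm{temp}}
```

Here $\mathcal{L}_{\mathrm{rec}}$ is the reconstruction loss, $\mathcal{L}_{\mathrm{bin}}$ encourages binary masks, $\mathcal{L}_{\mathrm{temp}}$ encourages temporal consistency, and the $\lambda$ weights are hyperparameters.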
Full pipeline
Results
Ablation on the loss
Ablation on the optical flow
Results on MoCA
Limitations
Other ablations