Applied Foundation Models
Practical Course -- Kick-Off Meeting
April 28th, 2025
Computer Vision Group
Felix Wimbauer, Dominik Muhle, Christoph Reich, and Daniil Zverev
Outline
Dominik Muhle
Info
Research Interests
Website: dominikmuhle.github.io
Felix Wimbauer
Info
Research Interests
Website: fwmb.github.io
Christoph Reich
Info
Research Interests
Website: christophreich1996.github.io
Daniil Zverev
Info
TUM, Yandex Data School
Research Interests
Website: https://akoepke.github.io/mumol.html
Introduction
Foundation Models
This Course
Relevance
Course Structure
Sessions & Important Dates
Project Assignment
GPU Infrastructure
Grading
Initial Presentation
Projects
#1 Driving Scenario Reconstruction
#2 VLM-based Image or Video Search
#3 VLM-based AI Tutor
#4 Image Editing with Foundation Models
#5 Video Editing with Foundation Models and Rendering
Setting:
Goal: Build a powerful video editing pipeline with SAM2, FoundationPose, and CoTracker
FoundationPose by NVIDIA
SAM2 by Meta AI
CoTracker by Meta AI
#6 Search in 3D Room
Setting:
Goal: Build a search engine for 3D scans of your room.
MASt3R-SLAM by Imperial College London
CLIP by OpenAI
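One way to picture the search step: if each region of a scan is given an embedding in the same space as a text query (as CLIP-style models provide), retrieval reduces to ranking regions by cosine similarity. The sketch below is a minimal illustration under that assumption. The toy 3-D vectors, the region names, and the `search` helper are all hypothetical stand-ins; a real project would replace them with embeddings from an actual model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def search(query_embedding, region_embeddings, top_k=3):
    """Return the top_k (region name, score) pairs, best match first."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in region_embeddings.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for real image features (assumption).
region_embeddings = {
    "desk": [0.9, 0.1, 0.0],
    "bookshelf": [0.1, 0.9, 0.1],
    "window": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a text query such as "where is my desk?".
query = [0.8, 0.2, 0.1]

results = search(query, region_embeddings, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

With real features, the dictionary would typically hold one embedding per keyframe or per 3D segment, and the ranking logic would stay the same.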
#7 Unsupervised Point Cloud Segmentation
Setting:
Goal: Build a foundation model approach for unsupervised segmentation of point clouds in the wild.
Wu et al. “Sonata: Self-supervised learning of reliable point representations”, CVPR, 2025
#8 Bringing Unsupervised Whole-Image Segmentation to Videos
Setting:
Goal: Fine-tune UnSAM for unsupervised whole-video segmentation
Wang et al. “Segment Anything without Supervision”, NeurIPS, 2024
#9 Build a Kicker Tracker and Commentator
Setting:
Goal: Build software that automatically analyzes the game and provides live commentary.
What’s next?
Teams & Project Preferences
Initial Presentation
Questions?