⚡️ Lightning Talks #1 🚀
one slide in one minute, let's go!
2.73×–7.63× speedup while retaining 98.6%–99.6% of the original accuracy!
Goal: To accelerate self-attention computation, especially when the sequence length is long.
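As background for this talk, a minimal PyTorch sketch of standard scaled dot-product attention, whose n × n score matrix is what makes long sequences expensive. This is the baseline being sped up, not the talk's method; shapes and names are illustrative.

```python
# Minimal sketch of standard scaled dot-product attention (PyTorch).
# The (n x n) score matrix dominates compute and memory at long
# sequence lengths; this is the baseline being accelerated.
import math
import torch

def vanilla_attention(q, k, v):
    """q, k, v: (batch, n, d) tensors."""
    d = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (batch, n, n): quadratic in n
    weights = scores.softmax(dim=-1)
    return weights @ v                               # (batch, n, d)

q = k = v = torch.randn(1, 4096, 64)
out = vanilla_attention(q, k, v)  # 4096 x 4096 ≈ 16.7M attention scores per head
```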
Safe Learning in the Real World via Adaptive Shielding with Hamilton-Jacobi Reachability
Michael Lu, Jashanraj Singh Gosain, Luna Sang, Mo Chen
Problem: Standard RL algorithms can be unsafe during learning, since they may violate safety constraints while exploring
Solution: Construct a "safety filter" using Hamilton-Jacobi Reachability that works with any off-policy RL algorithm
Contribution: Our shield adapts to model uncertainty. When the real system differs from our model, the safety filter becomes more conservative
Results: Fewer safety violations compared to fixed safety filters that don't adapt. Successfully tested on navigation tasks with minimal human intervention
Michael Lu
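As a rough illustration of the shielding idea above, here is a hedged Python sketch of a safety filter wrapped around an off-policy RL agent. The Hamilton-Jacobi value function, fallback safe policy, and adaptive margin are hypothetical placeholders, not the authors' implementation.

```python
# Hedged sketch of a shield-style safety filter around an off-policy
# RL agent. The HJ value function, fallback safe policy, and margin
# update below are hypothetical placeholders, not the authors' code.
import numpy as np

class AdaptiveShield:
    def __init__(self, hj_value_fn, safe_policy, base_margin=0.0):
        self.hj_value_fn = hj_value_fn  # V(x) > 0 roughly means x is inside the safe set
        self.safe_policy = safe_policy  # fallback controller from reachability analysis
        self.margin = base_margin       # extra conservativeness near the boundary

    def update_margin(self, model_error):
        # More observed model mismatch -> larger margin -> earlier intervention.
        self.margin = max(self.margin, model_error)

    def filter(self, state, rl_action):
        # Override the RL action only when the state is too close to the unsafe set.
        if self.hj_value_fn(state) <= self.margin:
            return self.safe_policy(state)
        return rl_action

# Toy 1D usage (hypothetical dynamics): safe set is |x| < 1.
shield = AdaptiveShield(hj_value_fn=lambda x: 1.0 - abs(x),
                        safe_policy=lambda x: -np.sign(x))
shield.update_margin(model_error=0.05)               # real system deviates from the model
action = shield.filter(state=0.97, rl_action=+1.0)   # overridden near the boundary: returns -1.0
```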
3D visual grounding models perform well on existing datasets…
…but do these datasets really capture the full range of visual grounding descriptions?
ScanRefer
there is a microwave on a countertop. it is in the corner.
ViGiL3D
Grab the box on the counter that appears taller than the rest.
Come learn about how ViGiL3D helps us understand and improve 3DVG models through diverse language!
ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding
Eye-Opening Open-Set Adaptation Analysis
Test-time Adaptation:
Open-Set Experiments:
Continual Adaptation: changing shifts
data from unknown classes: houses in testing data, outdoors in training data
Zefeng Li¹², Evan Shelhamer¹²
UBC¹ Vector Institute²
first train the model on standard/clean data
example test data: image corruptions like noise, blur, weather, digital
Closed-set (In Distribution = InD)
then adapt on test data that is new and different
Open-set (Out of Distribution = OOD)
How to make models adapt to unknown shifts and classes?
Batch Mixtures: varying the ratio of InD vs. OOD data per batch, from 100% InD to 100% OOD
[Figure: test batches shifting over time, e.g., Snow/InD → Brightness/InD → Contrast/OOD → …]
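For readers new to this setting, a hedged sketch of entropy-minimization test-time adaptation (in the spirit of TENT) on an unlabeled test batch that may mix InD and OOD samples. The model, optimizer choice, and hyperparameters are illustrative, not the paper's benchmark protocol.

```python
# Hedged sketch of entropy-minimization test-time adaptation on an
# unlabeled test batch that may mix InD and OOD samples.
import torch

def adapt_on_batch(model, optimizer, x):
    """One adaptation step on an unlabeled test batch x."""
    logits = model(x)
    log_probs = logits.log_softmax(dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    optimizer.zero_grad()
    entropy.backward()  # OOD (unknown-class) samples also drive this update: the open-set risk
    optimizer.step()
    return logits.detach()

# Toy usage: adapt only the normalization layers' affine parameters.
model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.BatchNorm1d(32),
                            torch.nn.ReLU(), torch.nn.Linear(32, 10))
bn_params = [p for m in model.modules()
             if isinstance(m, torch.nn.BatchNorm1d) for p in m.parameters()]
optimizer = torch.optim.SGD(bn_params, lr=1e-3)
logits = adapt_on_batch(model, optimizer, torch.randn(64, 16))
```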
Existing generative models fail on small datasets
Check out our poster: Rejection Sampling IMLE
[Figure: generated samples from diffusion models, a GAN, and RS-IMLE (ours) on datasets including Toy 2D and FFHQ 100]
Adaptive Randomized Smoothing:
Certified Adversarial Robustness for Multi-Step Defence
Saiyue Lyu¹*, Shadab Shaikh¹*, Frederick Shpilevskiy¹*, Evan Shelhamer², Mathias Lécuyer¹
University of British Columbia¹ Google DeepMind²
CelebA Result
Paper · Code
Adaptive Diffusion Denoised Smoothing:
Certified Robustness via Randomized Smoothing with
Differentially Private Guided Denoising Diffusion
Frederick Shpilevskiy¹, Saiyue Lyu¹, Krishnamurthy Dj Dvijotham², Mathias Lécuyer¹, Pierre-Andre Noel²
PUT Workshop Oral
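Both posters build on randomized smoothing; as a reference point, here is a minimal sketch of the generic smoothed prediction step (classify many Gaussian-noised copies of the input and take a majority vote). The adaptive masking and DP-guided diffusion denoising that these works contribute are not shown; the classifier and parameters are toy.

```python
# Hedged sketch of the generic randomized-smoothing prediction step:
# classify many Gaussian-noised copies and majority-vote.
import torch

@torch.no_grad()
def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=100, num_classes=10):
    """x: (C, H, W) input; returns the majority-vote class index."""
    counts = torch.zeros(num_classes, dtype=torch.long)
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)
        pred = base_classifier(noisy.unsqueeze(0)).argmax(dim=-1).item()
        counts[pred] += 1
    return counts.argmax().item()

# Toy usage (untrained classifier). Certification would additionally
# lower-bound the top-class probability (e.g., with a binomial
# confidence interval) to compute a certified radius.
net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
print(smoothed_predict(net, torch.randn(3, 32, 32)))
```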
ICML 2025: Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Sparse Training and Lottery Ticket Hypothesis (LTH):
Method:
"We found that LTH masks fail to generalize to new random initializations due to loss basin misalignment. To reuse an LTH mask with a different random initialization, we leverage permutation symmetries, to permute the mask to align with the new random initialization optimization basin."
Source: Visualization of a 2D loss landscape: (left) dense training & pruning; (right) permuting the mask enables sparse training.
Result:
[Figure: results at sparsity = 0.80, 0.90, 0.95, 0.97]
We find that a sparse model with a permuted mask and a new random initialization can nearly match the LTH solution's performance.
LTH showed that there exists a sparse sub-network that can match dense performance.
However, finding the LTH sparse mask is computationally expensive, and the mask doesn't work with new random inits.
Research Question:
How can we train with an LTH mask from a different random initialization while maintaining good generalization?
We show that by leveraging weight symmetries, an LTH mask can be reused with new random inits.
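To make the mask-permutation idea concrete, a small NumPy sketch of applying per-layer neuron permutations to an LTH mask for a simple MLP so that it aligns with a new random initialization's basin. How the permutations are found (e.g., by weight matching / re-basin-style alignment) is not shown; shapes and names are toy.

```python
# Hedged sketch: permute an LTH mask layer by layer so it can be
# reused with a new random initialization. Finding the permutations
# is not shown here.
import numpy as np

def permute_mask(masks, perms):
    """masks[l]: (out_l, in_l) binary mask for layer l.
    perms[l]: permutation of hidden layer l's output units."""
    permuted = [m.copy() for m in masks]
    for l, p in enumerate(perms):
        permuted[l] = permuted[l][p, :]          # permute layer l's output rows
        permuted[l + 1] = permuted[l + 1][:, p]  # and layer l+1's matching input columns
    return permuted

# Toy usage: a 2-layer MLP mask with one hidden-layer permutation.
masks = [np.random.rand(8, 4) > 0.8, np.random.rand(3, 8) > 0.8]
perms = [np.random.permutation(8)]
aligned_masks = permute_mask(masks, perms)  # reuse with the new initialization
```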
Chehre: Understanding the Language of Facial Expressions
Bita Azari, Zoe Stanley, Avneet Batra, Poorvi Bhatia, Hali Kil, Manolis Savva, Angelica Lim
Simon Fraser University (SFU)
Flattered
saddened
I’m mad!
What does this emoji mean to you?
Please express the emoji
Contact Me: bazari@sfu.ca
What can satellite imagery and machine learning measure?
Results
Jonathan Proctor, Tamma Carleton, Trinetta Chong, Taryn Fransen, Simon Greenhill, Jessica Katz, Hikari Murayama, Luke Sherman, Jeanette Tseng, Hannah Druckenmiller, Solomon Hsiang
Approach: We conduct 115 standardized large-scale experiments using a composite high-resolution optical image of Earth and a generalizable, accessible SIML (satellite imagery and machine learning) technology, evaluating which ground conditions can be measured accurately and where this technology struggles.
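As a rough illustration of the kind of SIML pipeline being evaluated (not the authors' exact system), a sketch that maps per-tile image features to a ground condition with cross-validated ridge regression; the features and labels below are synthetic stand-ins.

```python
# Hedged sketch of a generic SIML pipeline: per-tile image features ->
# regularized linear regression against a ground-truth label.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_tiles, n_features = 1000, 512
X = rng.normal(size=(n_tiles, n_features))    # stand-in for per-tile image features
y = 2.0 * X[:, 0] + rng.normal(size=n_tiles)  # stand-in for a ground condition (e.g., forest cover)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_tr, y_tr)
print("held-out R^2:", model.score(X_te, y_te))  # per-task predictive skill
```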
Reliable ML against Faulty Training Data
Label: Stop sign
Problem
Label: Pneumonia
Medical
Insight: Diversity increases reliability!
AVs
Our Solutions
Results
Dynamic weights using XAI
Ensembles
Diversity-guided Ensemble Search
Most resilient, minimal effort
28% more resilient
24% more resilient
Training Data
Abraham Chan, UBC (abrahamc@ece.ubc.ca)
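A hedged sketch of the underlying intuition only: a diverse ensemble is more resilient to mislabeled training data than a single model. The diversity-guided ensemble search and XAI-based dynamic weights from this work are not reproduced; the dataset and label-flip rate are toy.

```python
# Hedged sketch: majority vote over diverse models trained on data
# with simulated label faults.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = X[:1500], X[1500:], y[:1500].copy(), y[1500:]

# Simulate faulty training data: flip 20% of the training labels.
flip = np.random.default_rng(0).random(len(y_tr)) < 0.2
y_tr[flip] = 1 - y_tr[flip]

models = [LogisticRegression(max_iter=1000),
          RandomForestClassifier(random_state=0),
          KNeighborsClassifier()]
preds = np.array([m.fit(X_tr, y_tr).predict(X_te) for m in models])
vote = (preds.mean(axis=0) >= 0.5).astype(int)  # majority vote across diverse models
print("single model acc:", (preds[0] == y_te).mean())
print("ensemble acc:    ", (vote == y_te).mean())
```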
Lifelong Learned Video Diffusion Models Work
Yoo, J., He, Y., Naderiparizi, S., Green, D., van de Ven, G.M., Pleiss, G., Wood, F. (2024). Lifelong Learning of Video Diffusion Models From a Single Video Stream. arXiv preprint arXiv:2406.04814.
[Figure: ground truth vs. generated samples (Sample 1, Sample 2) for Dataset 1 and Dataset 2]
Paper
We Define and Analyze Memorization in Novel Models
When Backtracking isn’t Reasoning
Yunpeng Liu¹,², Berend Zwartsenberg¹, Frank Wood¹,²,³
Do imitative backtracking reasoning models improve reasoning scores by making corrections?
Loss mask for imitative backtracking
Random replacements yield similar performance as model-generated human-like mistaken reasoning steps.
The reasoning score increases as the number of random reasoning attempts followed by BACK tokens increases.
Inverted AI¹, University of British Columbia², Amii³
yunpengl@cs.ubc.ca
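To illustrate the kind of training data being discussed, a speculative sketch of building an "imitative backtracking" sequence by inserting randomly chosen wrong steps, each followed by a BACK token, before the correct steps, along with a per-token loss mask. The token names and masking convention are assumptions, not the paper's exact setup.

```python
# Speculative sketch of constructing an imitative-backtracking
# training sequence with a per-token loss mask.
import random

BACK = "<BACK>"

def make_backtracking_example(correct_steps, distractor_steps,
                              n_attempts=1, train_on_mistakes=True, seed=0):
    rng = random.Random(seed)
    tokens, loss_mask = [], []
    for step in correct_steps:
        for _ in range(n_attempts):
            tokens += [rng.choice(distractor_steps), BACK]  # wrong attempt, then backtrack
            loss_mask += [int(train_on_mistakes), 1]        # which positions receive loss
        tokens.append(step)
        loss_mask.append(1)
    return tokens, loss_mask

tokens, mask = make_backtracking_example(
    correct_steps=["a = 2", "b = a + 3", "answer = 5"],
    distractor_steps=["a = 7", "b = a - 1", "answer = 9"],
    n_attempts=2)
print(list(zip(tokens, mask)))
```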
Visual-Concepts-Driven Image Generation
How can we create embodied artificial intelligence agents capable of interacting naturally and purposefully with humans in complex, open-ended environments?
Ongoing: Developing real-time EAI agents
PLAICRAFT.AI
(Join & Plai!)
https://blog.plaicraft.ai/
Only 24% of Canadians received any AI training
Situation
Solving Canada’s Technical AI Safety Talent Gap
Our Proposal
Strategic Benefits
79% of Canadians are concerned about the negative outcomes of AI
Canada ranked 44th out of 47 in AI literacy
Almost half (48%) of managers and executives feel their employees are not or barely prepared to use AI
Cole Thacker
cole_thacker@sfu.ca
Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow
Anthony Fuller (Carleton PhD student + Vector intern)
LookWhere · Poster Session #1
LookWhere: Efficiency with Self-Supervised Adaptive Computation
A. Fuller*¹, Y. Yassin*¹, J. Wen¹, D.G. Kyrollos¹, T. Ibrahim¹, J.R. Green†¹, E. Shelhamer†¹²³
Carleton¹ UBC² Vector Institute³
Multi-Modeling for Efficiency, Adaptivity, and Uncertainty
Tim G. Zhou (UBC MSc student, incoming)
Asymmetric Duos · Away—another time!
T.G. Zhou¹, E. Shelhamer¹², G. Pleiss¹²
UBC¹ Vector Institute²
Asymmetric Duos: Uncertainty with the Help of a Sidekick
Zefeng Li (UBC PhD student + Vector student)
Open-Set Adaptation · Poster Session #1
Zefeng Li¹², E. Shelhamer¹²
UBC¹ Vector Institute²
Open-Set Updates: Adaptivity with More Analysis and Benchmarks
known classes + unknowns.