Blue-Sky Research in LLMs and Alignment
Soujanya Poria
On behalf of DeCLaRe Lab
[Timeline figure: before 2023, NLP research spanned fundamental tasks (POS tagging, HMM/CRF, NER, MT, NLI, text classification) and applications (finance, search/IR, social media, chatbots/assistants, anomaly detection, medical). Across 2023-2024 the field converged on LLMs: alignment (PPO, DPO), CoT prompting, model merging, distillation, k-bit quantization, reasoning, navigation agents, auto-evaluation, and search/IR.]
Why Is Alignment Needed?
Preference Alignment: RLHF
Alignment is Often Defined as
Helpfulness – all types of capabilities required to help the user.
Harmlessness – resistance against all types of harmful intent and against exposing users to harmful information:
- Trustworthiness.
- Safety.
- Privacy.
Understanding Reasoning Bottlenecks
Hong et al. Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions.
Helpfulness
Are LLMs Math Marvels?
Testing LLMs’ Math and Coding Competency
You have perhaps heard about GSM-Symbolic but not about this work 😓. This work came 10 months before GSM-Symbolic.
Perturbation dimensions: Original | Logic Alteration | Concept Analysis | Format Constraint

Math example:
- Original. Question: John has 3 boxes. Each box is 5 inches by 6 inches by 4 inches. The walls are 1 inch thick. What is the inner volume of all 3 boxes? Answer: Walls are 1 inch thick, reducing each dimension by 2 inches. Thus, the internal dimensions become 3x4x2 = 24 cubic inches, making the total volume for all 3 boxes 3×24 = 72 cubic inches.
- Logic Alteration (Variable Relationship). Question: John has X boxes. Each box is Y inches by 6 inches by 4 inches. … If the total inner volume of all the boxes is 72 cubic inches, find the equation that relates X and Y. Answer: Walls are 1 inch thick, reducing each dimension by 2 inches. Thus, the internal dimensions become (Y-2)x4x2 cubic inches for one box. Given the total volume for all X boxes is 72 cubic inches, the equation relating X and Y simplifies to X*(Y-2) = 9.
- Concept Analysis (Step Necessity). Question: … Suppose we want to find out the total inner volume of all 3 boxes. To solve this math question, is there a way to determine the total inner volume of all 3 boxes without calculating the inner volume of one box? One possible answer: Yes; calculate it as total inner volume = total outer volume - volume of the walls.
- Format Constraint (Reasoning Format). Question: Answer the following question with only base-2 coded values. … What is the total inner volume of all 3 boxes? Answer (in binary): Walls are 1 (bin) inch thick, reducing each dimension by 10 (bin) inches. Thus, the internal dimensions become 11x100x10 = 11000 (bin) cubic inches, making the total volume for all 11 (bin) boxes 11x11000 = 1001000 (bin) cubic inches.

Multimodal example:
- Original. Question: Observing the kitchen scenario depicted, what might be the most likely reason for the water spills and mess on the floor? Answer: The most likely reason for the water spills and mess on the floor in the depicted kitchen scenario could be the pot boiling over on the stove. The visible steam and the slightly ajar lid suggest that the contents of the pot may have bubbled up and spilled out, contributing to the water on the countertop and subsequently on the floor. Additionally, the running faucet could imply that water was left running, possibly due to a hurried or distracted attempt to address the boiling over, leading to further spills. The presence of cleaning supplies like a sponge and a brush on the wet floor indicates an effort to clean up the resultant mess, which supports the idea that an accident occurred during cooking.
- Logic Alteration (Variable Relationship). Question: Given the current state of the kitchen, if the presence of an overflowing pot correlates with water spills and a sponge and brush are found on the floor, what might the relationship be between these observed items and the resultant mess? Answer: The correlation could suggest that the water spills are due to the overflowing pot, and the presence of cleaning tools (sponge and brush) indicates an attempt to manage the mess. Therefore, the relationship is that the overflowing pot is the source of the water spills, and the tools are there for cleanup efforts.
- Concept Analysis (Step Necessity). Question: In the depicted kitchen scenario, is it necessary to observe the overflowing pot to deduce that the water spills are a result of someone cooking? One possible answer: No, it is not strictly necessary to observe the overflowing pot to deduce that someone was cooking. Inductive reasoning from the presence of a pot on the stove, along with other cooking utensils, and the mess associated with cooking activities can lead to the conclusion that the water spills are a result of cooking activities.
- Format Constraint (Reasoning Format). Question: Analyze the depicted kitchen scene using deductive reasoning to determine the cause of the water spills. Explain the reasoning process and conclusion. Answer: From the image, we see an overflowing pot on the stove, a running faucet, and cleaning tools on the floor. Using deductive reasoning: …
Research Question
We chose to study these questions on the easiest dataset: GSM8K.
If we can make even GSM8K harder for LLMs, we succeed!
Question Decomposition
Ontology
Picked 5 Random Questions
Curation
For variability:
- Manual effort is needed when creating a high-quality dataset! GPT-4 cannot evaluate itself.
Examples - Logic Alteration
Original:
Variable Relationship:
John has X boxes. Each box is Y inches by 6 inches by 4 inches. The walls are 1 inch thick. If the total inner volume of all the boxes is 72 cubic inches, then find the equation that relates X and Y?
Examples - Concept Analysis
Original:
Step Necessity:
John has 3 boxes. Each box is 5 inches by 6 inches by 4 inches. The walls are 1 inch thick. Suppose we want to find out the total inner volume of all 3 boxes. To solve this math question, is there a way to determine the total inner volume of all 3 boxes without calculating the inner volume of one box?
Examples - Format Change
Original:
Question:
John has X boxes. Each box is Y inches by 6 inches by 4 inches. The walls are 1 inch thick. If the total inner volume of all the boxes is 72 cubic inches, then find the equation that relates X and Y?
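To make such perturbations reproducible, each seed question can be templated so its constants are programmable. Below is a minimal sketch under that assumption (the template and helper names are illustrative, not the paper's code); it regenerates the original box question and its Variable Relationship variant:

```python
# Minimal sketch of ontology-guided perturbation from a templated seed question.
# The template and helpers are illustrative assumptions, not the paper's code.

SEED = {"boxes": 3, "dim": (5, 6, 4), "wall": 1}

def original(q):
    (a, b, c), w = q["dim"], q["wall"]
    inner = (a - 2 * w) * (b - 2 * w) * (c - 2 * w)   # walls shrink each side by 2w
    text = (f"John has {q['boxes']} boxes. Each box is {a} inches by {b} inches "
            f"by {c} inches. The walls are {w} inch thick. "
            f"What is the inner volume of all {q['boxes']} boxes?")
    return text, q["boxes"] * inner                    # here: 3 * 24 = 72

def variable_relationship(q):
    # Logic alteration: abstract two constants into X and Y, fix the total,
    # and ask for the equation relating them.
    (_, b, c), w = q["dim"], q["wall"]
    total = original(q)[1]
    text = (f"John has X boxes. Each box is Y inches by {b} inches by {c} inches. "
            f"The walls are {w} inch thick. If the total inner volume of all the "
            f"boxes is {total} cubic inches, find the equation that relates X and Y.")
    per_box = (b - 2 * w) * (c - 2 * w)                # inner volume = (Y - 2) * 4 * 2
    return text, f"X*(Y-2) = {total // per_box}"       # here: X*(Y-2) = 9
```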
Results - General
Multimodal bottlenecks
Chia et al. PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns. ACL Findings 2024.
PuzzleVQA Ontology
The Struggle of LLMs on PuzzleVQA
How about Algorithmic Puzzles?
The Story Does Not Change…
How about Planning?
Can Do Dataset
Planning Bottlenecks
Learning to Reason
Chia et al. Learning to Reason and Explore From Diverse Paths. EMNLP 2024
Helpfulness
Motivations
Reasoning Paths Optimization: A Framework for Exploring and Learning from Diverse Reasoning Paths
Framework: Reasoning Paths Optimization
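My reading of the exploration step, sketched below as data construction for a DPO-style trainer: branch from intermediate steps of a seed solution and pair branches that reach the gold answer with ones that do not (`continue_from` is a hypothetical stand-in for an LLM call):

```python
# Minimal sketch: explore diverse reasoning paths and harvest preference pairs.
def continue_from(question: str, prefix_steps: list[str]):
    """Sample (remaining_steps, final_answer) continuing a partial solution. Stub."""
    raise NotImplementedError

def explore_pairs(question: str, seed_steps: list[str], gold: str, k: int = 4):
    pairs = []
    for t in range(len(seed_steps)):                  # branch at each intermediate step
        prefix = seed_steps[:t]
        rollouts = [continue_from(question, prefix) for _ in range(k)]
        good = [r for r in rollouts if r[1] == gold]  # branches reaching the gold answer
        bad = [r for r in rollouts if r[1] != gold]   # branches that go astray
        if good and bad:                              # one preference pair per branch point
            pairs.append({"prefix": prefix, "chosen": good[0], "rejected": bad[0]})
    return pairs
```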
Main Results
Analysis
Takeaways
Improving Helpfulness with Verification
Yu et al. Reward Steering with Evolutionary Heuristics for Inference-time Alignment. arXiv 2024.
Helpfulness
Not All Votes Count!
Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning
Motivations
Key Idea
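The key idea, as the titles suggest: among N sampled solutions, only votes whose answers are reproduced by an independently generated, executable program should count. A minimal sketch, with `sample_cot` and `sample_program` as hypothetical stand-ins for LLM calls:

```python
from collections import Counter

def sample_cot(question: str):
    """Return (rationale, numeric_answer) from one chain-of-thought sample. Stub."""
    raise NotImplementedError

def sample_program(question: str) -> str:
    """Return Python source defining solve() that computes the answer. Stub."""
    raise NotImplementedError

def verified_vote(question: str, n: int = 16):
    votes = []
    for _ in range(n):
        _, answer = sample_cot(question)
        scope: dict = {}
        try:
            exec(sample_program(question), scope)     # run the generated verifier
            verified = scope["solve"]()
        except Exception:
            continue                                  # unexecutable program: drop the vote
        if abs(verified - answer) < 1e-6:             # keep only program-consistent votes
            votes.append(answer)
    if not votes:
        raise ValueError("no verified votes; fall back to plain majority voting")
    return Counter(votes).most_common(1)[0][0]        # majority over verified votes only
```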
Framework
Main Results
Analysis
Takeaways
Inference-time Alignment
Yu et al. Reward Steering with Evolutionary Heuristics for Inference-time Alignment. arXiv 2024.
Helpfulness
Problem and Challenge
Problem: ensuring LLMs operate in a way that aligns with human-intended goals. This involves guiding the model's behavior to be safe, reliable, and aligned with the desired outcomes of its users, avoiding harmful or biased outputs.
Challenge: current preference optimization methods interfere with the model's prior training and risk losing adherence to evolving user expectations.
Inference-Time Alignment: aligning models without explicit weight updates, by modifying the LLM's decoding procedure.
Reward Steering with Evolutionary Heuristics for Inference-time Alignment
Darwin approaches the inference-time alignment problem as a reward-guided tree search.
✔ Decouples the exploration and exploitation phases of tree search
✔ Can be applied on top of preference-tuning methods
✔ Uses an off-the-shelf reward model
✔ Outperforms strong baselines on AlpacaEval 2 and MT-Bench
Exploration and Exploitation
| Phase | Method | Description |
|---|---|---|
| Exploration | Sample N | Sample N independent continuations from a given prompt |
| Exploration | Instruction Mutation | Prompt the LLM to rewrite the original instruction into N mutated instructions; generate an output with each |
| Exploitation | Best of N | Select the highest-rewarded sequences |
| Exploitation | Reward-guided beam replacement | Periodically replace low-reward sequences with the top-k rewarded sequences |
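A minimal sketch of how these four pieces could compose into one search loop (my reading of the recipe, not the official implementation; `generate`, `mutate_instruction`, and `reward` are hypothetical stand-ins for the LLM and an off-the-shelf reward model):

```python
def generate(instruction: str) -> str:
    raise NotImplementedError            # one sampled continuation of the prompt

def mutate_instruction(instruction: str) -> str:
    raise NotImplementedError            # LLM-rewritten variant of the instruction

def reward(instruction: str, response: str) -> float:
    raise NotImplementedError            # off-the-shelf reward model score

def darwin(instruction: str, n: int = 8, rounds: int = 3, top_k: int = 2) -> str:
    # Exploration: N independent samples plus samples from mutated instructions.
    candidates = [generate(instruction) for _ in range(n)]
    candidates += [generate(mutate_instruction(instruction)) for _ in range(n)]
    for _ in range(rounds):
        # Exploitation: reward-guided beam replacement keeps the top-k sequences...
        candidates.sort(key=lambda r: reward(instruction, r), reverse=True)
        beam = candidates[:top_k]
        # ...and freed slots are refilled by further exploration.
        beam += [generate(mutate_instruction(instruction)) for _ in range(n - top_k)]
        candidates = beam
    # Best of N: return the single highest-rewarded response.
    return max(candidates, key=lambda r: reward(instruction, r))
```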
Darwin Workflow
Darwin Improves LLM Performance during Inference
Darwin Improves SimPO and DPO
Motivations
Model Merging
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Model Merging:
Problem:
Maintaining separate fine-tuned models for different tasks presents several limitations, e.g., memory footprint, cost, and the inability to leverage transfer learning across tasks.
DELLA-Merging
Drop
Elect
Fuse
DELLA-Merging: MagPrune (Drop Step)
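A minimal sketch of the Drop step's magnitude-based sampling as I read it (the exact probability schedule in DELLA may differ): larger-magnitude delta parameters get a higher keep probability, and survivors are rescaled so the expected delta is unchanged.

```python
import torch

def magprune(delta: torch.Tensor, keep_min: float = 0.1, keep_max: float = 0.9):
    """Stochastically drop delta parameters with magnitude-dependent probability."""
    mag = delta.abs().flatten()
    # rank-based keep probability: larger magnitude -> closer to keep_max
    ranks = mag.argsort().argsort().float() / max(mag.numel() - 1, 1)
    p_keep = (keep_min + (keep_max - keep_min) * ranks).reshape(delta.shape)
    mask = torch.bernoulli(p_keep)
    # rescale like inverted dropout so the expected delta matches the dense one
    return delta * mask / p_keep
```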
Results
Deep et al. DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling. arXiv 2024.
Multimodal RAG
Overview
Data Example
Data Overview
Data Construction
Evaluation Framework
Preliminary Study
Training Framework
Results
Vision Language Action Models
LLM for Robotics
Sun et al. Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning. arXiv 2024.
Helpfulness
Meet Emma-X: An Embodied Multimodal Action Model
Overview of Emma-X
Overview of Data Construction
Training and Inference with Emma-X
Emma-X is the new SOTA
Emma-X in Action
Open the microwave
Pick up an object that is a kind of vegetable
NORA
VLA Trained from Scratch
NORA
Demo (Kitchen)
With object distraction: Put the pink toy in pot
With human distraction: Put the carrot in pot
With human + object distraction: Put the pink toy in pot
NORA
Community Reception
🚀 4K+ downloads in just two weeks
Tango Model Family
Text to Audio Generation
Majumder et al. Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization. ACM MM 2024.
Helpfulness
Background of Tango
Observations on Tango Outputs
Alignment to the Rescue
Alignment Dataset
Strategy 1: prompt → four audio samples; vary the denoising steps
Strategy 2: prompt → perturbed prompts → audio samples
Strategy 3: prompt → temporally perturbed prompts → audio samples
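Putting the three strategies together, a minimal sketch of how (preferred, rejected) pairs might be assembled, assuming a CLAP-style scorer ranks candidates by agreement with the original prompt (`generate_audio` and `clap_score` are hypothetical stand-ins):

```python
def generate_audio(prompt: str, denoise_steps: int):
    """One text-to-audio sample at the given number of denoising steps. Stub."""
    raise NotImplementedError

def clap_score(prompt: str, audio) -> float:
    """CLAP-style prompt-audio agreement score. Stub."""
    raise NotImplementedError

def preference_pair(prompt: str, perturbed_prompts: list[str]):
    # Strategy 1: several samples for the same prompt at varied denoising steps.
    candidates = [generate_audio(prompt, steps) for steps in (25, 50, 100, 200)]
    # Strategies 2-3: audios generated from (temporally) perturbed prompts.
    candidates += [generate_audio(p, 200) for p in perturbed_prompts]
    # Rank all candidates against the ORIGINAL prompt; take extremes as the pair.
    ranked = sorted(candidates, key=lambda a: clap_score(prompt, a), reverse=True)
    return ranked[0], ranked[-1]        # (preferred, rejected) for DPO
```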
Perturbed Prompts
Alignment Dataset
Audio Alpaca Stats
Results
TangoFlux
Hung et al. TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization. arXiv 2024.
Online Iterative Training
Results
Community Response
Multimodal Representation Learning
Helpfulness
Why Multimodal? — Human Communication is Multimodal
Introduction
Major Challenge in Multimodal Analysis
Introduction
Blueprint of Multimodal Fusion
[Diagram: audio, visual, and text inputs → unimodal representations → intermediate representations → joint multimodal representation.]
Introduction
Hazarika et al. MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis. ACM MM 2020.
MISA vs. the Rest
Introduction
Why Disentangle Features?
Introduction
Task Setup
Overall Framework
Method
Combining Modality-invariant and -specific Features
Method
Distributional Similarity for Invariant Features
Method
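MISA pulls the modality-invariant subspaces together with a distributional similarity loss based on Central Moment Discrepancy (CMD). A minimal sketch of a common CMD formulation over two batches of invariant features (matching the means plus central moments up to order k):

```python
import torch

def cmd(x: torch.Tensor, y: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Central Moment Discrepancy between two [batch, dim] feature matrices."""
    mx, my = x.mean(0), y.mean(0)
    loss = (mx - my).norm(p=2)                        # match the means
    cx, cy = x - mx, y - my
    for order in range(2, k + 1):                     # match higher central moments
        loss = loss + ((cx ** order).mean(0) - (cy ** order).mean(0)).norm(p=2)
    return loss
```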
Modeling Orthogonal Modality-specific Features
Method
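The modality-specific subspaces are pushed away from the invariant ones with a soft orthogonality (difference) loss. One common formulation, sketched here (details may differ from the released code): penalize the squared correlation between zero-meaned, L2-normalized feature batches.

```python
import torch
import torch.nn.functional as F

def diff_loss(h1: torch.Tensor, h2: torch.Tensor) -> torch.Tensor:
    """Soft orthogonality between two [batch, dim] feature matrices."""
    h1 = F.normalize(h1 - h1.mean(0), p=2, dim=1)
    h2 = F.normalize(h2 - h2.mean(0), p=2, dim=1)
    return (h1.t() @ h2).pow(2).mean()                # squared cross-correlation
```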
Preventing the Learning of Trivial Representations
Method
Combining Modality-invariant and -specific Features
Method
One of the first works to show that attention can be used for multimodal fusion
The Overall Loss Function
Method
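With weights on each regularizer, the combined objective takes the form (my summary of the loss names above; the weights are hyperparameters):

L = L_task + α·L_sim + β·L_diff + γ·L_recon

where L_sim is the CMD similarity loss on invariant features, L_diff the orthogonality loss on specific features, and L_recon the reconstruction loss that prevents trivial representations.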
Datasets
Experiments
Baselines
- Temporal Fusion: MFN, MARN, MV-LSTM, RMFN
- Attention Transformer: RAVEN, MulT
- Graph-based: Graph-MFN
- Tensor-Fusion: TFN, LMF, LMFN, HFFN
- Common Representations: MCTN, ARGF, MFM
- Inter-utterance Joint Models: BC-LSTM, CH-FUSION, CIA, CIM-MTL, DFF-ATMF
Experiments
State of the Art
Interaction Canonical Correlation Network
Sun, Z. et al., Learning relationships between text, audio, and video via deep canonical correlation for multimodal language analysis. AAAI 2020
Contextual Memory Fusion Network
Hasan et al. UR-FUNNY: A Multimodal Language Dataset for Understanding Humor. EMNLP-IJCNLP 2019
Experiments
Low-level Features
Experiments
- Language: GloVe token embeddings or BERT sentence embeddings
- Audio: COVAREP (12 Mel-frequency cepstral coefficients, pitch, voiced/unvoiced segments, …)
- Visual: Facial Action Coding System (CMU-MOSI/CMU-MOSEI); OpenFace (UR_FUNNY)
Results
CMU-MOSI
MAE — Lower is better
Results
CMU-MOSEI
UR_FUNNY
Similar Trend of Results on CMU-MOSEI and UR_FUNNY
Results
Ablations
Analysis
t-SNE Projections
Final loss vs. a variant with no similarity and difference losses.
Analysis
Contribution of Learned Vectors
Analysis
Learning Curve
Improvements
Multimodal-Infomax: Overall Idea
Modalities are correlated: capture this correlation in the representations.
Maximize mutual information:
- between pairs of modalities, helping to learn better intermediate representations;
- between the modalities and the fused representation.
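One standard way to realize "maximize mutual information" is a contrastive InfoNCE lower bound; a minimal sketch for the modality-fused pair (the actual Multimodal-Infomax estimators may be formulated differently):

```python
import torch
import torch.nn.functional as F

def infonce(z_modality: torch.Tensor, z_fused: torch.Tensor, tau: float = 0.1):
    """Contrastive MI bound: matched (modality, fused) pairs vs. in-batch negatives."""
    z1 = F.normalize(z_modality, dim=1)
    z2 = F.normalize(z_fused, dim=1)
    logits = z1 @ z2.t() / tau                        # [B, B] similarity matrix
    labels = torch.arange(z1.size(0))                 # positives on the diagonal
    return F.cross_entropy(logits, labels)            # minimizing maximizes the bound
```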
Improvements
Multimodal-Infomax: Results
CMU-MOSEI
CMU-MOSI
Trustworthiness
Song et al. Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse. arXiv 2024.
Harmlessness
LLMs Hallucinate
Do LLMs Know what they Know?
Jokes Apart: The Problem is Really Critical
Problem
Retrieval-Augmented Generation (RAG) as a Solution to Hallucination
LLM Groundedness
Grounded response:
Previous Works
Evaluation
Mitigation
Key Contributions
TRUST-SCORE
Assesses an LLM across multiple dimensions:
1) Grounded Refusals: is the model able to discern which questions can be answered or refused based on the provided documents?
2) Exact Match scores: For the answerable questions, is the response correct?
3) Citation recall: Are the generated statements supported by the corresponding citations?
4) Citation precision: Are the citations to the statements relevant?
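A minimal sketch of a composite over the four dimensions, assuming each component is normalized to [0, 1] and equally weighted (the paper's exact aggregation may differ):

```python
def trust_score(refusal_f1: float, exact_match: float,
                citation_recall: float, citation_precision: float) -> float:
    """Equal-weight composite of the four TRUST-SCORE dimensions (assumption)."""
    components = [refusal_f1, exact_match, citation_recall, citation_precision]
    assert all(0.0 <= c <= 1.0 for c in components), "components must be in [0, 1]"
    return sum(components) / len(components)
```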
TRUST-ALIGN
Collecting Quality Questions
Collecting D’s
Augmenting (q,D) Set
Answerability Labelling
Details on Claim Document Mapping
Augmenting (q,D) Set
Obtaining r+ and r−
Effectiveness of Our Data Construction Approach
TRUST-ALIGN Boosts Trustworthiness of Models
TRUST-ALIGN Improves Models’ Refusal Capability
TRUST-ALIGN Enhances Models’ Citation Quality
Mixed Results on Exact Match Recall due to Models’ Usage of Parametric Knowledge
Models Aligned with DPO Generally Outperform those Trained with SFT
TRUST-ALIGN Generalizes across Model Families and Sizes
Importance of Refusal Samples in TRUST-ALIGN
Improvements Generalize to Out-of-Domain Data
Studying Parametric Knowledge Access
Quantify how many unanswerable questions were answered correctly
Revised Metrics Are Less Biased
- Reduction in the performance gap
- Revealing our model's stronger performance compared to the baseline
Key Findings
Paper:
Codebase:
Introduction: Knowledge-Intensive Tasks
Introduction: Chain-of-Thought
Introduction: Retrieval-Augmented LLMs
Introduction: Synergizing Reasoning, Retrieval, Correction
Overview
Framework: Reasoning Generation & Domain Selection
Relevant domains: Factual (Wikidata, Wikipedia)
Framework: Iterative Retrieval and Correction
Framework: Adaptive Query Generation
Diverse Query Examples
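A minimal sketch of the synergy as a loop, with hypothetical stubs for the LLM and retriever calls: draft a rationale, generate adaptive queries from it, retrieve from the selected domain, and correct the rationale against the evidence.

```python
def draft_rationale(question: str) -> str:
    raise NotImplementedError            # initial chain-of-thought draft

def make_queries(question: str, rationale: str) -> list[str]:
    raise NotImplementedError            # adaptive queries conditioned on the draft

def retrieve(query: str, domain: str) -> list[str]:
    raise NotImplementedError            # e.g., Wikidata/Wikipedia for factual queries

def correct(rationale: str, evidence: list[str]) -> str:
    raise NotImplementedError            # revise unsupported claims against evidence

def answer(question: str, domain: str = "factual", iters: int = 3) -> str:
    rationale = draft_rationale(question)
    for _ in range(iters):                            # iterative retrieval + correction
        evidence: list[str] = []
        for q in make_queries(question, rationale):
            evidence += retrieve(q, domain)
        rationale = correct(rationale, evidence)
    return rationale
```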
Main Results
Analysis: Effect of Multiple Knowledge Sources
Analysis: Factuality of Rationales
Takeaways
Safety
Harmlessness
Safety issues with LLMs
Ferret: Motivation
Ferret: Methodology
Pala et al. Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique. arXiv 2024.
Ferret: Main Results
Ferret: Analysis
Language Models are Homer Simpson!
The solution is quite simple!
Bhardwaj et al. Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic. ACL 2024.
Summary of the results
Side-effects
More Generalized Version
Safety Arithmetic
Hazra et al. Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations. EMNLP 2024.
Understanding Safe Align
- Start with a few exemplars.
- The solution to this equation is the PCA of …
- Add the ICV to the latent states, taking the latent vectors from unsafe toward safe (sketched below).
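A minimal sketch of that steering step (model-hooking details are my assumptions, not the paper's code): the ICV is the first principal component of the safe-minus-unsafe latent differences, added to hidden states at inference.

```python
import torch

def compute_icv(h_safe: torch.Tensor, h_unsafe: torch.Tensor) -> torch.Tensor:
    """First principal component of (safe - unsafe) exemplar latents, [n, dim]."""
    diffs = h_safe - h_unsafe
    diffs = diffs - diffs.mean(0)
    _, _, v = torch.pca_lowrank(diffs, q=1)
    return v[:, 0]                                    # unit-norm steering direction

def steer(hidden: torch.Tensor, icv: torch.Tensor, alpha: float = 1.0):
    """Shift latent states toward the safe region (e.g., inside a forward hook)."""
    return hidden + alpha * icv
```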
Simple yet Effective Solution!
[Results figure: harmful behavior of Base vs. SFT models; WM = WizardMath, LM = LlamaMath, EC = EvolCodeAlpaca; lower is better.]
WalledEval
A Comprehensive Safety Evaluation Toolkit for Large Language Models
Safety vs. Refusal (exaggerated safety)
Multilingual Safety
LLM Benchmarking: numbers on the left for the first four datasets indicate the percentage of safe responses to unsafe prompts, referred to as harmful behavior (judge: LlamaGuard 2). Numbers on the right represent the percentage of instances where the LLM correctly chooses to refuse (for unsafe prompts) or accept (for safe prompts), referred to as refusal behavior (judge: MCQJudge). Green, yellow, and red denote the highest, second-highest, and lowest scores in each column, respectively. XSTest (Mutated) refers to XSTest_m.
Judge Benchmarking
Thank you!