Jiawei Han
Siebel School of Computing and Data Science
University of Illinois at Urbana-Champaign
August 3, 2025
Reasoning with Structures for Large Language Models
Outline
Why Is a Theme-Specific Knowledge Graph a Critical Structure?
Empowering LLMs: Prompting, Fine-Tuning, RAG & Structuring
Figures adapted from Y. Gao et al., RAG Survey, arXiv:2312.10997
O. Ovadia et al. (2023), "Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs", arXiv:2312.05934
Retrieval and Structuring?
Fine-tuning + Retrieving + Structuring?
A Retrieving-Structuring-Reasoning Framework
[Figure: the Retrieving-Structuring-Reasoning framework]
Inputs and resources: User Query/Task, Text & Multimodal Data, General KB, LLMs
Retrieving: Query/task-guided, theme-focused information retrieval, yielding selected, distilled, relevant documents
Structuring: Task-specific structure mining and graph construction, yielding multiple theme- or function-specific knowledge graphs (causal graph, event structure, aspect graph)
Reasoning: Task- and structure-based augmentation for LLM generation, yielding knowledge with quality reasoning
StructRAG: Motivation and Methodology
Li et al., "StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization", ICLR 2025.
StructRAG: Experiments and Analyses
Do We Need Knowledge Graphs for LLM Reasoning?
Pengcheng Jiang, Cao Xiao, Minhao Jiang, Parminder Bhatia, Taha Kass-Hout, Jimeng Sun, Jiawei Han, "Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval", Int. Conf. on Learning Representations (ICLR 2025)
Retrieval and Structuring for LLM-Empowered Reasoning
KARE: The General Framework
Step 1: Medical Concept Knowledge Graph Construction and Indexing (1)
Step 2: Patient Context Construction and Augmentation
Patient Base Context
Ensure the augmented context includes the most relevant and diverse information from the KG, tailored to the patient's specific conditions and the prediction task.
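One generic way to balance relevance against diversity when picking KG items for a context is a greedy maximal-marginal-relevance (MMR) style selection. The sketch below is a swapped-in illustration of that idea, not KARE's actual retrieval procedure; the function names, the toy vectors, and the `lam` trade-off parameter are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_diverse(query_vec, candidates, k, lam=0.7):
    """Greedy MMR-style selection (hypothetical helper).

    candidates: dict mapping an item name to its embedding vector.
    lam: weight on relevance; (1 - lam) penalizes redundancy with
         items that were already selected.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:
        def score(name):
            relevance = cosine(query_vec, remaining[name])
            redundancy = max(
                (cosine(remaining[name], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected

# Toy embeddings (made up): "b" is a near-duplicate of "a", "c" is different.
patient_query = [1.0, 0.0]
kg_items = {"a": [1.0, 0.0], "b": [0.99, 0.1], "c": [0.6, 0.8]}

relevance_only = select_diverse(patient_query, kg_items, k=2, lam=1.0)
diversity_heavy = select_diverse(patient_query, kg_items, k=2, lam=0.3)
```

With `lam=1.0` the selection is pure relevance ranking and keeps the near-duplicate; lowering `lam` trades some relevance for coverage of different information.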
Step 3: Reasoning-Enhanced Precise Healthcare Prediction
Experiment Setting: Task, Data and Metrics
Performance Comparison on MIMIC-III Dataset
Results are averaged over multiple runs. An asterisk (∗) marks metrics that are important for handling imbalanced datasets.
RepoGraph: Background and Motivation
Ouyang et al., "RepoGraph: Enhancing AI Software Engineering with Repository-level Coding Graph", ICLR 2025
A perfect testbed for RAS in the engineering domain!
RepoGraph: Methodology
RepoGraph: Experiments and Analyses
Recall improves at all granularities, though the gain is smaller at finer granularities.
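To make "recall at a granularity" concrete, the sketch below projects gold and retrieved code locations onto progressively finer levels and computes recall at each. The three levels (file, function, line), the `Location` tuple layout, and the toy data are assumptions for illustration; they are not taken from the RepoGraph evaluation code.

```python
from typing import Iterable, Tuple

# Hypothetical location format: (file path, function name, line number).
Location = Tuple[str, str, int]

# How many leading fields of a Location identify it at each level.
LEVEL_WIDTH = {"file": 1, "function": 2, "line": 3}

def project(locations: Iterable[Location], level: str) -> set:
    """Truncate each location to the fields that matter at this level."""
    width = LEVEL_WIDTH[level]
    return {loc[:width] for loc in locations}

def recall(gold: Iterable[Location], retrieved: Iterable[Location], level: str) -> float:
    """Fraction of gold locations that were retrieved, at the given level."""
    gold_set = project(gold, level)
    if not gold_set:
        return 0.0
    return len(gold_set & project(retrieved, level)) / len(gold_set)

# Toy data (made up): the retriever finds the right functions in a.py
# but only one exact line, and misses b.py entirely.
gold = [("a.py", "f", 10), ("a.py", "g", 42), ("b.py", "h", 7)]
retrieved = [("a.py", "f", 10), ("a.py", "g", 40), ("c.py", "k", 1)]

scores = {level: recall(gold, retrieved, level) for level in LEVEL_WIDTH}
```

A finer level is only hit when every coarser field also matches, so each gold location has to clear a stricter test as the granularity gets finer.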
Why SARG: Structure-Augmented Reasoning Generation?
Jash Parekh, P. Jiang, J. Han, "Structured Multi-Hop Augmented Reasoning Generation", arXiv:2508
The SARG Framework
From a domain-specific dataset, an LLM extracts causal triples zero-shot, and the triples are structured into a DAG. Given a query, the system identifies semantic matches, performs forward or backward traversal to extract causal chains, and generates a justification-based answer using an LLM.
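The graph stage of this pipeline (triples into a DAG, then forward or backward traversal to collect causal chains) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SARG implementation: the triples are made-up toy data, semantic matching is replaced by an exact string lookup, and `build_graph`, `causal_chains`, and `max_hops` are hypothetical names. The LLM extraction and answer-generation steps are omitted.

```python
from collections import defaultdict

def build_graph(triples):
    """Index (cause, relation, effect) triples in both directions."""
    forward, backward = defaultdict(list), defaultdict(list)
    for cause, _relation, effect in triples:
        forward[cause].append(effect)   # cause -> its effects
        backward[effect].append(cause)  # effect -> its causes
    return forward, backward

def causal_chains(adjacency, start, max_hops=3):
    """Enumerate simple paths from `start`, at most `max_hops` edges long.

    Pass the forward index to ask "what does X lead to?" and the
    backward index to ask "what leads to X?".
    """
    chains = []

    def dfs(path):
        nxt = [n for n in adjacency.get(path[-1], []) if n not in path]
        if not nxt or len(path) - 1 >= max_hops:
            if len(path) > 1:
                chains.append(path)
            return
        for node in nxt:
            dfs(path + [node])

    dfs([start])
    return chains

# Toy triples (invented for illustration only).
triples = [
    ("interest rate hike", "causes", "lower liquidity"),
    ("lower liquidity", "causes", "bitcoin price drop"),
    ("regulatory ban", "causes", "bitcoin price drop"),
]
forward, backward = build_graph(triples)

effects_of_hike = causal_chains(forward, "interest rate hike")
causes_of_drop = causal_chains(backward, "bitcoin price drop")
```

Chains returned from the backward index read effect-to-cause; reversing each list gives the cause-to-effect order that would be handed to the answer-generation step.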
Zero-Shot Extraction of Causal Triples
Zero-Shot Causal Triple Extraction
SARG Methodology: Graph-Based Multi-Hop Reasoning
SARG: Chain Ranking and LLM-Guided Answer Generation
LLM-Powered Output Generation with Justification
SARG’s key advantages:
Performance Comparison: SARG vs. RAG vs. Zero-Shot
Data Statistics: BP (Bitcoin Price) and GD (Gaucher Disease)
Automatic Evaluation: SARG vs. RAG vs. Zero-Shot
Evaluation of Summarization Quality by LLM and Human
Accuracy on a HotpotQA hard-100 subset
Human evaluation results showing preferred responses across the BP and GD datasets. Numbers indicate votes received out of the total questions evaluated.
Aspect-based Reasoning Structure Extraction
Priyanka Kargupta, Runchu Tian, Jiawei Han, "Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims", ACL 2025
Framework of ClaimSpect: Hierarchical Aspect Discovery
ClaimSpect: Performance Comparison
Incon (Inconsistent): when the positions of the methods are flipped in the prompt, the opposite conclusion is drawn.
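The position-flip check behind the "Incon" label can be sketched as follows: each pair of outputs is judged twice, once in each presentation order, and a verdict only counts as a win when both orders agree. The method labels `"ours"` and `"baseline"`, the helper names, and the toy verdicts are assumptions for illustration, not the evaluation code used in the paper.

```python
from collections import Counter

def resolve(verdict_original, verdict_flipped):
    """Combine two judge verdicts for the same pair of outputs.

    verdict_original: winning method when method A is shown first.
    verdict_flipped:  winning method when method B is shown first.
    Returns the winner if both orders agree, otherwise "incon".
    """
    if verdict_original == verdict_flipped:
        return verdict_original
    return "incon"

def tally(verdict_pairs):
    """Count wins per method and inconsistent judgments."""
    return Counter(resolve(a, b) for a, b in verdict_pairs)

# Toy verdicts (made up): one (original order, flipped order) pair per claim.
verdict_pairs = [
    ("ours", "ours"),          # consistent win for "ours"
    ("ours", "baseline"),      # verdict changed with the order
    ("baseline", "baseline"),  # consistent win for "baseline"
    ("baseline", "ours"),      # verdict changed with the order
]
counts = tally(verdict_pairs)
```

Reporting the inconsistent count separately keeps position bias in the judge from being credited to either method.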
Dataset statistics in experiments
Prompt for generating nuanced claims
Task: Generate 10 nuanced and diverse claims based on this corpus. The claims should adhere to the following criteria:
Diversity: The claims should be sufficiently varied
Complexity: The claims should be complex and controversial (and not necessarily true) …
Research Feasibility: The claims should not be too specific and should pertain to topics ...
Concision: The claims should be concise and focused in one short sentence
Completeness: The claims should be complete and not require additional context to understand.
Output: Provide the claims as a list.
ClaimSpect: Case Study
Synergizing Unsupervised Episode Detection with LLMs
Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, Jiawei Han, "Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events", ACL 2025
Challenges of Mining Unsupervised Episodes with LLMs
EpiMine: Unsupervised Episode Detection
EpiMine: Experiments and Performance Comparison
EpiMine: Data Statistics
Results are averaged across each theme (the mean number of episodes that EpiMine identifies per theme is in parentheses). Results are computed on each key event corpus using the top-5 documents for each detected episode. We run EpiMine 10 times and report the average of each measure.
EpiMine: Case Study
Gold and detected episodes (at most five are shown for brevity) for the "2019 Hong Kong Legislative Protests" key event.
Looking forward: Graph Mining & Structure-Guided LLM Generation
[Figure: the Retrieving-Structuring-Reasoning framework, repeated from "A Retrieving-Structuring-Reasoning Framework"]
Data mining could be an important step for LLMs!
References for Part 4: "Reasoning with Structures for LLMs"