Data & Intelligent Systems (DAIS)
Professors Abdu Alawini, Arindam Banerjee, George Chacko, Kevin Chang, Pablo Robles-Granda, Jiawei Han, Jingrui He, Heng Ji, Daniel Kang, Yongjoo Park,
Jimeng Sun, Hari Sundaram, Hanghang Tong, Jiaxuan You, ChengXiang Zhai
Who are we? (incomplete list of DAIS faculty)
Kevin Chang: Knowledge Acquisition, Integration, & Mining
Jiawei Han: Data Mining
ChengXiang Zhai: Intelligent Information Systems
Hari Sundaram: Social Network Analysis, Crowdsourcing
Hanghang Tong: Data Mining, Network Mining
Yongjoo Park: Data-intensive Systems
Jingrui He: Learning from Data Heterogeneity
Arindam Banerjee: Machine Learning, Spatio-temporal Data
Jimeng Sun: AI for healthcare
Heng Ji: Multi-Source Information Extraction
George Chacko: Scientometrics
Daniel Kang: Big Data & Machine Learning Systems
Pablo Robles-Granda: Machine Learning & Health Informatics
Abdu Alawini: Educational Data Mining, CS Education
Jiaxuan You: Graph-powered data and AI
+ Lots of students!
What are we working on? Data to Intelligence (“Big Data”)
[Diagram] Dimensions: Scalability, Intelligence, Application impact
[Diagram] Application impact: Health/Medical/Biology; Education; Productivity (Web, email, …); Decision making (government, business, personal)
[Diagram] Functions: Search; Data/Information Access; Information analysis & Data mining; Browsing; Recommendation; Decision/Task support; Intelligent information agent
[Diagram] Data scale: Gigabytes; Terabytes; Petabytes; Storage
[Diagram] Related fields: Artificial Intelligence; Statistics; Systems & Networking; Parallel Computing; Human-Computer Interaction, Graphics; Bioinformatics; Theory & Algorithms
Big Data-Enabled Artificial Intelligence
Big Data
Training Data
(Supervised) Machine Learning
Autonomous AI: Intelligent systems to replace humans
Observations of World
(Unsupervised) Machine Learning
Generative Models, Data Mining
Intelligence
Human
Assistive AI: Intelligent systems to assist humans
Autonomous AI vs. Assistive AI
Autonomous AI
Simple tasks, Big training data available, Limited domain
Intelligence (Machine) ≤ Intelligence (Human) [Upper Bound]
Assistive AI
Any domain, All kinds of data
“Data Scope” to enhance human perception, Complex tasks
Intelligence (Machine + Human) > Intelligence (Human) [Lower Bound]
Tasks that humans can’t (easily) do
(Augmentation of human intelligence)
Human in the Loop
Strong presence in multiple research communities: Data Mining, Information Retrieval, Databases, Web, …
ACM SIGKDD
ACM SIGIR
ACM SIGMOD
ACM SIGWEB
Data Source: ACM Digital Library: https://dl.acm.org/sigs, Retrieved September 28, 2025
Name | Count |
Tsinghua Univ. | 683 |
Univ. of Amsterdam | 611 |
UIUC | 573 |
Name | Count |
Microsoft Research | 634 |
Microsoft Corporation | 605 |
Tsinghua Univ. | 587 |
UIUC | 574 |
Name | Count |
Tsinghua Univ. | 675 |
UIUC | 567 |
Name | Count |
Tsinghua Univ. | 492 |
Carnegie Mellon Univ. | 398 |
Google LLC | 379 |
UIUC | 352 |
Come and Join a Strong Body of DAIS Students!
Want to know more? Visit the DAIS website at
Data and Intelligent Systems (DAIS)
Abdu Alawini: Educational Data Mining and AI in Education
Current Interests:
Echelon: An AI Tool for Clustering SQL Queries
TriQL: A tool for learning relational, graph and document-oriented database programming
A database approach to identifying collaborative learning behaviors
Arindam Banerjee: Spatio-Temporal Data Analysis
Current Interests:
Ecology: Modeling plant traits, biodiversity
Climate Science: Predicting climate variables
Deep learning geometry
- Sparse, low-rank gradients
- Optimization, Generalization
Deep generative models
- Normalizing flows
- Variational inference
Sequential Decision Making
- Contextual bandits
- Smoothed analysis
George Chacko: Scientometrics and Networks
Current Interests:
The Emergence of Scientific Ideas
Center-periphery Structure in Communities
Kevin Chang: Knowledge acquisition and information integration over structured and unstructured data
General Problems:
Techniques: large language models, natural language processing, information retrieval, data mining, machine learning, and large-scale data analytics.
Current Projects:
Pablo Robles-Granda: Machine Learning and Health Informatics
General Problems:
Techniques: Data Mining/ML, Graph Mining, Graphical Models, Nonlinear Dynamical Systems
Current Interests:
Jiawei Han: LLM for Text Mining and Science Applications
LLM: Knowledge & Guidance
Scientific Text Corpora
Knowledge & Intelligence
General and domain-specific KB
Theme-focused Information Retrieval
Knowledge Graphs
PLMs + LLMs
Selected and Distilled Documents & Passages
Structured Passages
Theme-based Information Extraction & Knowledge Graph Construction
Feedback & Refinement
Jingrui He: Learning from Data Heterogeneity
Evaluation of Agentic Multimodal RAG
Heterogeneous MAS
Transformer Copilot for LLM Fine-tuning
Climate and Sustainability, National Security, Agriculture, Healthcare, etc.
Heng Ji: Multimodal Knowledge Extraction / Knowledgeable Foundation Models / Science-Inspired AI
Yongjoo Park: Systems for Data-Intensive Artificial Intelligence (AI)
Exploratory AI
Kishu: World's First Undoable Notebook
● Git-like versioning for exploratory AI
● Not just code: checkpoints code+data
LLM for Data (Retrieval-Augmented Gen)
LazyAttention for Efficient RAG
● enables position-oblivious caching, for the first time
● avoids repetitive processing (inside transformer blocks)
● allows extremely fast response time (time-to-first-token)
…
large database
LLM with LazyAttention
query
answer
any documents in any order
Also, building new vector databases for
● extremely parallel indexing/retrieval
● new forms of approximate nearest neighbor
No processing of these docs → fast response
Daniel Kang: ML + Data systems
We are building AI agents to solve data problems
Jimeng Sun
Hari Sundaram: Social Network Dynamics
http://sundaram.cs.illinois.edu
How can social networks influence behavior at large scale?
sustainability
public health
exercising
traffic
In my group we develop theory, design algorithms, build systems and run experiments
algorithmic game theory, equilibria
large-scale network analysis
message synthesis, spatial codes
Hanghang Tong: Network Learning & Mining
Jiaxuan You: Graph AI and AI Agents
Data
Model
Human
Multi-modal data as graphs
Knowledge sharing with graphs
Graph data in AI
Foundational models
AI agent
Democratize AI
Human AI Collaboration
Applications
ChengXiang (“Cheng”) Zhai: Intelligent Information Systems
Medical decision support
Personalized nutrition
Affordable personalized learning at scale
Acceleration of scientific discovery
Especially Big Text Data
Intelligent Task Agent
1. Big data analytics
2. Human-Like NLP
3. Human-AI collaboration
4. Computational user modeling
5. Augmented Intelligence