1 of 43

CSCI-SHU 205: Topics in Computer Science

Human-AI Alignment

Hua Shen

Course Website: https://hua-shen.org/src/course_bialign.html

2025-09-01

Lecture 1

2 of 43

Welcome to the BiAlign course 👏!

Hello from Your Instructor!

Hua Shen

Assistant Professor of Computer Science

huashen@nyu.edu | huashen218

https://hua-shen.org/

3 of 43

Outline

  1. What is Human-AI Alignment? (20 min)
  2. Why Human-AI Alignment? What if Not? (15 min)
  3. Course Overview and Logistics (20 min)
  4. Discussion: Know more about You 🧡 ! (20 min)

By joining today's class, you will:

· decide whether this course fits you well;

· get an overview of Human-AI Alignment and the motivation for it;

· share what you want to get from this course.

4 of 43

Outline

  • What is Human-AI Alignment? (20 min)
  • Why Human-AI Alignment? What if Not? (15 min)
  • Course Overview and Logistics (20 min)
  • Discussion: Know more about You! (20 min)

5 of 43

Share Your Thoughts 🙌

What’s your understanding of “Human-AI Alignment”?

6 of 43

AI systems are deeply integrated into our lives …

Autonomous Cars

Writing Assistant

Image Generation

7 of 43

AI ethics are NOT fully aligned with human values…

Crashes with Autonomous Cars

Writing Assistant Generates Misinformation

Stereotypical Biases in Image Generation

8 of 43

Challenges in state-of-the-art research

Human-centered and AI-centered research & skills are largely divided!

“Find” AI Problems for Humans

“Address” Problems in AI

Human-AI Alignment!

9 of 43

What is Human-AI Alignment?

10 of 43

What is Human-AI Alignment?

Towards Bidirectional Human-AI Alignment

12 of 43

Objective of Human-AI Alignment

Maximizing Capabilities AND Minimizing Risks in Human–AI Co-Evolution

13 of 43

Scope of Human-AI Alignment

Human-AI Alignment involves:

  • Capture human values and feedback for LLMs / AI Agents
  • Develop LLMs / AI Agents to integrate human values & feedback
  • Interpret models to humans for human-AI collaboration
  • Study the social impact of LLMs / AI Agents for trustworthy and safe AI
  • ……

14 of 43

Who is the human in “Human-AI Alignment”?

15 of 43

Efforts towards Human–AI Alignment so far…

16 of 43

Efforts towards Human–AI Alignment so far…

2025 Tutorial on Human-AI Alignment (Dec 2025)

Instructors: Hua Shen, Mitchell Gordon, Adam Tauman Kalai

Panelists: Yoshua Bengio, Dawn Song, Monojit Choudhury, Hannah Kirk, Eric Gilbert

17 of 43

Outline

  • What is Human-AI Alignment? (20 min)
  • Why Human-AI Alignment? What if Not? (15 min)
  • Course Overview and Logistics (20 min)
  • Discussion: Know more about You! (20 min)

18 of 43

State-of-the-art Research

Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., ... & Lowe, R. Training language models to follow instructions with human feedback. NeurIPS 2022.

Human feedback is primarily rating and ranking…
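
To make the "ranking" form of feedback concrete, here is a minimal sketch of how a single human comparison is commonly turned into a training signal for a reward model: a pairwise (Bradley-Terry style) ranking loss. The function name and the scalar reward inputs are hypothetical and chosen only for illustration; this shows the general technique and is not taken from the slide.

```python
import math


def pairwise_ranking_loss(reward_preferred, reward_rejected):
    """Loss for one human comparison: -log(sigmoid(r_preferred - r_rejected)).

    Both arguments are hypothetical scalar scores that a reward model
    assigns to the response the annotator preferred and to the one the
    annotator rejected.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Equal rewards mean the model cannot tell the responses apart,
# so the loss is log(2), roughly 0.693.
tie_loss = pairwise_ranking_loss(0.0, 0.0)

# Scoring the preferred response higher gives a smaller loss than
# scoring it lower.
agree_loss = pairwise_ranking_loss(2.0, -1.0)
disagree_loss = pairwise_ranking_loss(-1.0, 2.0)
```

The loss depends only on the difference between the two scores, which is why a ranking alone, without absolute ratings, can be enough to train on.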

19 of 43

State-of-the-art Research

Bai, Y., et al. "Constitutional AI: Harmlessness from AI Feedback." arXiv:2212.08073.

Perez, E., et al. "Red Teaming Language Models with Language Models." arXiv:2202.03286.

Responsible AI work commonly involves minimal human participation

20 of 43

Missing Diverse Human Participation…

Ion Stoica, Keynote: Reliability: An AI Challenge. Agentic AI Summit 2025. https://www.youtube.com/watch?v=c39fJ2WAj6A (1:27:07)

Ion Stoica, Co-Founder of Databricks & Anyscale; Professor at UC Berkeley (at Agentic AI Summit 2025)

21 of 43

22 of 43

Without humans in the loop of AI development & deployment…

23 of 43

How?

24 of 43

This BiAlign course equips you with fundamental knowledge, technical skills, and practical project experience to advance human-AI alignment in future research.

25 of 43

Outline

  • What is Human-AI Alignment? (20 min)
  • Why Human-AI Alignment? What if Not? (15 min)
  • Course Overview and Logistics (20 min)
  • Discussion: Know more about You! (20 min)

26 of 43

Course Overview

I. Foundations

II. Methods

III. Practice

I. Foundations (Weeks 1-2)

  • Overview:
    • Evolving Challenges of AI Alignment and Human's Role
  • Values & Morals in LLMs:
    • Theories, Evaluation and Social Impacts

27 of 43

Course Overview

II. Methods (Weeks 3-9)

  • Data is Gold:
    • Co-Annotation, Styles and Validation
  • Post-Training for Alignment:
    • Prompt, SFT, RLHF and Interactive Alignment
  • Evaluation and Ecosystem:
    • Automatic and Human Evaluation & Platform

28 of 43

Course Overview

III. Practice (Weeks 10-15)

  • Human-AI Interaction:
    • Design, Evaluation & Use Cases
  • LLM Interpretability:
    • Mechanistic & Human-Centered Interpretability
  • Risks, Trust and Safety:
    • Benchmarks & Human-Centered Auditing

29 of 43

Course Goals:

  • Gain Foundational Knowledge on Human-AI Alignment
  • Become Familiar with Cutting-edge Research
  • Harness Hands-on Technical Skills
  • Practical Projects

30 of 43

Mapping Goals to Course Activities

Course Goals and their Course Activities:

  1. Gain Foundational Knowledge: Lectures (by the instructor)
  2. Become Familiar with Cutting-edge Research: Paper Presentations (led by you 👏)
  3. Harness Hands-on Skills: Two Assignments
    • Assignment 1: AI-Focused
    • Assignment 2: HCI-Focused
  4. Practical Projects: One Final Project

31 of 43

Course Activities

Paper Presentation: 20%

  • Earlier presentation dates receive higher scores.

Project: 50%

  • Proposal: 10%
  • Midway Report: 10%
  • Midway Presentation: 5%
  • Final Submission: 20%
  • Final Presentation: 5%

Assignments: 20%

  • Assignment 1: 10%
  • Assignment 2: 10%

Participation: 10%

  • General participation: 5%
  • Question contribution: 5%
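
As a quick check on how these weights combine, here is a minimal sketch that computes a weighted course grade. The weights are copied from the breakdown above; the component names, the `final_grade` helper, and the 0-100 score scale are assumptions made only for illustration, not an official grading formula.

```python
# Weights copied from the grading breakdown, in percent of the final grade.
# The dictionary keys are hypothetical labels for each graded component.
WEIGHTS = {
    "paper_presentation": 20,
    "project_proposal": 10,
    "project_midway_report": 10,
    "project_midway_presentation": 5,
    "project_final_submission": 20,
    "project_final_presentation": 5,
    "assignment_1": 10,
    "assignment_2": 10,
    "general_participation": 5,
    "question_contribution": 5,
}


def final_grade(scores):
    """Return the weighted course grade on a 0-100 scale.

    `scores` maps component names to scores in [0, 100] (an assumed scale);
    components missing from `scores` count as 0.
    """
    return sum(
        weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items()
    ) / 100.0


# The components add up to 100%, and the project parts to 50%.
total_weight = sum(WEIGHTS.values())
project_weight = sum(
    weight for name, weight in WEIGHTS.items() if name.startswith("project_")
)

# Example: a score of 80 on the paper presentation alone contributes
# 20% of 80, i.e. 16 points, to the final grade.
presentation_only = final_grade({"paper_presentation": 80})
```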

32 of 43

Clarification on Several Course Activities

Paper Presentation

  • Sign-up Link: Sign-up Google Sheet;
  • Team: 1-2 students per group;
  • Papers: select at least one paper from reading materials to present;
  • Earlier presentation dates potentially receive higher scores.

Paper Title

Name(s)

33 of 43

Clarification on Several Course Activities

Course Assignments

Assignment 1 (10%): LLM Post-Training Alignment (AI)

  • Data: Human-value-related datasets
  • Alignment: Prompting, Finetuning, Instruction-tuning, RLHF/DPO (optional)

Assignment 2 (10%): Human-LLM Interactive Alignment (HCI)

  • Frontend UI Design: collect human value data
  • Backend LLM Server: incorporate human values to customize AI

34 of 43

Clarification on Several Course Activities

Projects

  • Team: 1-2 students per group;
  • Scope (examples below; other topics are welcome):
    • Novel human-LLM co-annotated datasets & benchmarks
    • Improving human-Agentic AI interaction workflows
    • Novel evaluation & oversight on alignment and interaction
    • (Human-centered) interpretability and Agentic AI
    • Novel solutions to LLM/agent risks and safety issues

35 of 43

Course Policy

  • Please get familiar with NYU Shanghai’s Statement on Academic Integrity (in the Undergraduate Bulletin).
  • Generative AI Policies:
    • You may use Generative AI ethically in this course;
    • Generative AI can be used to support your research and coding;
    • But the creativity and novel ideas should be your own.
  • Late policy:
    • Each student has one free late day for submissions;
    • Final project papers cannot be turned in late under any circumstances.
  • You can always let me know if you have any questions or need support.

36 of 43

Computing Resources

Generative AI Tools and Services at NYU Shanghai

Commercial

  • How to Access:
    • OpenAI.com (account needed)
    • Microsoft Bing (with built-in ChatGPT-5 functionality)
    • Public AI Tools (ChatGPT, Claude, Bard): direct platform access; strictly prohibited for any NYU-related information
  • Collect data? Yes (personal use only)

Institutional License (@NYU IT)

  • How to Access:
    • Google Gemini & NotebookLM: available through NYU IT with NetID
  • Collect data? No (NYU-wide license)

Private By Request

  • How to Access:
    • Private Generative AI Pilot (OpenAI ChatGPT): submit a project proposal via NYU IT; approval takes 3-5 days
  • Collect data? No

37 of 43

Outline

  • What is Human-AI Alignment? (20 min)
  • Why Human-AI Alignment? What if Not? (15 min)
  • Course Overview and Logistics (20 min)
  • Discussion: Know more about You 🧡 ! (20 min)

38 of 43

Love to know more about you!

What are your experience with and expectations for this course?

39 of 43

Share Your Experience & Insights

  1. A quick introduction of yourself
  2. Why are you interested in this course?
  3. What do you want to gain from this course?
  4. What do you need (or want to see) in this course?

40 of 43

Your feedback & discussion are always welcome!

41 of 43

Summary

What is Human-AI Alignment? (20 min)

Why Human-AI Alignment? What if Not? (15 min)

Course Overview and Logistics (20 min)

Discussion: Know more about You! (20 min)

Next class –

Overview: Evolving Challenges of AI Alignment and Human's Role

42 of 43

Assignment #0

Due: 11:59 PM, Sep 7, 2025 (Sun).

(China Standard Time)

  • Sign-up Link: Sign-up Google Sheet;
  • Reading Materials: Project Website;

Reading Materials

43 of 43

Thank You 💛

See you on Wednesday!