1 of 12

AI Safety and Trustworthy AI

2024.11.21

Chon, Kilnam

Closed-Door Meeting on Trustworthy AI, 2024 WIC, 2024.11.21 (rev. 11.22)

2 of 12

“The AI Safety Meeting will address the following core topics”

“What are the challenges humanity faces in the application of AI?”

“What constitutes safe and trustworthy AI?”

“How can nations safely deploy advanced AI technologies to promote their economic and social development?”

3 of 12

Future of Life Institute (FLI)

  • AI Safety Research Project, 2015~
  • Asilomar Conference on Beneficial AI, 2017~2021
  • AI Principles, 2022

4 of 12

AI Safety Meetings

2023.11 Bletchley, UK (with Bletchley Declaration)

2024.03 (Second) International Dialogue on AI Safety, Beijing

2024.05 AI Seoul Summit (co-hosted by UK and South Korea)

2024.11 Inaugural International AI Safety Institute Meeting, San Francisco

2025.02 AI Action Summit, France

5 of 12

AI Standardization

  • ITU

Trustworthy AI (under the AI for Good programme)

  • ISO

ISO/IEC 42001:2023, AI Management System

  • CEN/CENELEC

6 of 12

Remarks/Issues

  • Multistakeholder vs Multilateral (vs Unilateral)

  • Global South (~G20) vs Global North (G7)

  • AGI – How to handle AGI under “AI Safety”?

7 of 12

References

AI Safety Institute, Wikipedia, 2024.

State of AI safety in China, Concordia AI, 2023.

Bletchley Declaration, 2023.11.

ChinAI Newsletter, 2010s~

Chinese AISI Counterparts, Institute for AI Policy & Strategy, 2024.10.30.

D. Hassabis, Accelerating science discovery, 2024.11.

(Second) International Dialogue on AI Safety, Beijing, 2024.3.10-11.

AI Safety Summit Talks with Yoshua Bengio, YouTube, 2024.5.

Int’l Scientific Report on the Safety of Advanced AI, AI Seoul (Safety) Summit, 2024.5.

Stuart Russell, What if we succeed?, 2024.

USG announced a global cooperation plan among AI safety organizations, 2024.5.

US Vision of AI Safety, Elizabeth Kelly, Director of AI Safety Institute, 2024.

USG, Framework to advance AI gov. & risk management in national security, 2024.

Yi Zeng, Keynote Speech, Closed-Door Meeting on Trustworthy AI, Wuzhen, 2024.

8 of 12

Appendix: AI Safety – Definition

AI safety is an interdisciplinary field focused on preventing accidents, misuse, or other harmful consequences arising from artificial intelligence (AI) systems. It encompasses machine ethics and AI alignment, which aim to ensure AI systems are moral and beneficial, as well as monitoring AI systems for risks and enhancing their reliability. The field is particularly concerned with existential risks posed by advanced AI models.

9 of 12

Appendix: Trustworthy AI

  • Definition: Trustworthy AI refers to AI systems designed and deployed to be transparent, robust, and respectful of data privacy. [Wikipedia 2024]

  • ITU Standardization on Trustworthy AI

A work programme initiated under the AI for Good programme. [Wikipedia 2024]

10 of 12

Appendix: Chinese AI Safety Institutes

  1. Full Coverage (Tech Research & Evaluation, Standards, Int’l Cooperation)

China Academy of Information and Communications Technology (CAICT), MIIT

Shanghai AI Laboratory

Beijing Academy of AI

  2. Partial Coverage (Standards for TC260, Int’l Cooperation for I-AIIG)

TC260

Institute for AI International Governance (I-AIIG)

Source: Institute for AI Policy and Strategy, Chinese AISI Counterparts, 2024.10.30.

Remark: There is an extensive network of AI safety institutes in China (Yi Zeng)

11 of 12

Appendix: Bletchley Declaration (excerpt)

Countries agreed that substantial risks arise from potential misuse or unintended issues of control of frontier AI, with particular concern about cybersecurity, biotechnology, and disinformation risks. The Declaration sets out agreement that there is “potential for serious, even catastrophic, harm, either deliberate or unintentional, stemming from the most significant capabilities of AI models.” Countries also noted risks beyond frontier AI, including bias and privacy.

12 of 12

Appendix: Stuart Russell

  • Make Safe AI vs Make AI Safe
