AI Safety and Trustworthy AI
2024.11.21
Chon, Kilnam
Closed-Door Meeting on Trustworthy AI, 2024 WIC, 2024.11 (rev. 11.22)
“The AI Safety Meeting will address the following core topics”
“What are the challenges humanity faces in the application of AI?”
“What constitutes safe and trustworthy AI?”
“How can nations safely deploy advanced AI technologies to promote their economic and social development?”
Future of Life Institute (FLI)
AI Safety Meetings
2023.11 AI Safety Summit, Bletchley, UK (with the Bletchley Declaration)
2024.03 (Second) International Dialogue on AI Safety, Beijing
2024.05 AI Seoul Summit (co-hosted by UK and South Korea)
2024.11 Inaugural International AI Safety Institute Meeting, San Francisco
2025.02 AI Action Summit, France
AI Standardization
Trustworthy AI (under AI for Good)
ISO/IEC 42001:2023 AI – Management System
Remarks/Issues
References
AI Safety Institute, Wikipedia, 2024.
State of AI safety in China, Concordia AI, 2023.
Bletchley Declaration, 2023.11.
ChinAI Newsletter, 2010s~.
Chinese AISI Counterparts, Institute for AI Policy and Strategy, 2024.10.30.
D. Hassabis, Accelerating science discovery, 2024.11.
(Second) International Dialogue on AI Safety, Beijing, 2024.3.10-11.
AI Safety Summit Talks with Yoshua Bengio, YouTube, 2024.5.
International Scientific Report on the Safety of Advanced AI, AI Seoul (Safety) Summit, 2024.5.
Stuart Russell, What if we succeed?, 2024.
USG, Announcement of global cooperation plan among AI safety organizations, 2024.5.
US Vision of AI Safety, Elizabeth Kelly, Director of AI Safety Institute, 2024.
USG, Framework to advance AI governance and risk management in national security, 2024.
Yi Zeng, Keynote Speech, Closed-Door Meeting on Trustworthy AI, Wuzhen, 2024.
Appendix: AI Safety – Definition
AI safety is an interdisciplinary field focused on preventing accidents, misuse, or other harmful consequences arising from artificial intelligence (AI) systems. It encompasses machine ethics and AI alignment, which aim to ensure AI systems are moral and beneficial, as well as monitoring AI systems for risks and enhancing their reliability. The field is particularly concerned with existential risks posed by advanced AI models.
Appendix: Trustworthy AI
A work program initiated under the AI for Good programme. [Wikipedia, 2024]
Appendix: Chinese AI Safety Institutes
China Academy of Information and Communications Technology (CAICT), MIIT
Shanghai AI Laboratory
Beijing Academy of Artificial Intelligence (BAAI)
2. Partial Coverage (standards for TC260, international cooperation for I-AIIG)
TC260
Institute for AI International Governance (I-AIIG)
Source: Institute for AI Policy and Strategy, Chinese AISI Counterparts, 2024.10.30.
Remark: There is an extensive network of AI safety institutes in China (Yi Zeng).
Appendix: Bletchley Declaration (excerpt)
Countries agreed that substantial risks arise from potential misuse of, or unintended issues in controlling, frontier AI, with particular concern regarding cybersecurity, biotechnology, and disinformation risks. The Declaration sets out agreement that there is “potential for serious, even catastrophic, harm, either deliberate or unintentional, stemming from the most significant capabilities of AI models.” Countries also noted risks beyond frontier AI, including bias and privacy.
Appendix: Stuart Russell