The State of AI Safety in China
Spring 2024 Report
Published May 14, 2024
Executive Summary
Table of Contents

Section 1: Introduction and scope
Section 2: Technical safety research
Section 3: International governance
Section 4: Domestic governance
Section 5: Lab and industry practices
Section 6: Expert views on AI risks
Section 7: Public opinion on AI
Section 8: Additional resources
Section 9: About us
Introduction and Scope

Thanks to positive feedback on our first report and rapid AI developments since October 2023, we have decided to issue an update!
Our report focuses on “frontier AI risks.”
[Figure: Scope of the report. AI systems are arrayed by generality (narrow AI → general AI) and potential harm (less → more). The report covers the two higher-harm categories: narrow AI systems with dangerous capabilities (e.g. AI models used for bioengineering) and highly capable general-purpose foundation models (e.g. GPT-3.5, Llama 2, as well as more advanced models). Out of scope are low-risk narrow systems (e.g. AlphaGo, AlphaFold) and sub-frontier foundation models (e.g. GPT-3).]
Our report focuses on AI safety rather than AI security.
Technical Safety Research

Overview of key developments since October 2023
Methodology for selecting Chinese Frontier AI Safety Papers
Methodology for identifying Key Chinese AI Safety-relevant Research Groups
2.1 Overall trends: The relevance and quantity of frontier safety research have increased substantially compared to mid-2023, and the most popular research direction has been alignment.
2.2 Key research groups
2.3 Notable technical papers
Over the past 6 months, there has been an average of nearly 15 frontier AI safety papers per month, compared to an average of 6 per month for the preceding 7 months – a substantial increase.
Chinese researchers are showing interest in various frontier AI safety research directions, with alignment being the most represented. However, research on the interpretability of frontier models is relatively lacking.
[Chart: Frontier AI Safety Research Directions]
2.1 Overall trends
2.2 Key research groups: The majority of key research groups we identified, 8 out of 11, have leading safety researchers with at least 1 of 2 major research honors. This suggests that the groups spearheading frontier safety research are likely producing high-quality work.
2.3 Notable technical papers
We identified 11 relevant groups, a decline from the 13 on our October 2023 list due to a much higher bar for inclusion: 3 frontier safety papers over the past year.
These research groups are concentrated mostly in universities, but there are some examples in private industry and state-backed labs.
The AI safety research groups are located primarily in China’s AI hubs of Beijing and Shanghai.
*ByteDance Research is not included on this graph, as researchers based in the US conducted the relevant AI safety research.
8 out of the 11 labs have at least 1 safety paper anchor author who has either received a top conference best paper award nomination, been ranked in the top 2% of their field by Stanford, or both.
[Chart: honors held by each group's anchor authors (Stanford top 2% ranking and/or conference best paper): ByteDance Responsible AI, Fudan NLP, MSRA, PKU CAISG / PAIR, SHLAB, SHJT GAIR, THUNLP, and Tsinghua Foundation Model Research Center / CoAI]
2.1 Overall trends
2.2 Key research groups
2.3 Notable technical papers: The following slides in this subsection dive into key technical papers, nearly all from the past 6 months. Readers may also choose to skip forward to the “International Governance” section instead.
Alignment: There is now some work on addressing broader social questions around alignment, as well as some preliminary attempts towards scalable oversight.
Alignment: Several research groups have begun exploring large language model (LLM) unlearning approaches.
Alignment: Chinese researchers are interested in improving Constitutional AI approaches.
Alignment work has extended to how human values are understood across languages.
Alignment of multi-agent systems is also the subject of multiple papers.
Robustness work includes backdoor attacks…
Robustness to adversarial multimodal attacks…
Robustness to attacks via coding…
…and robustness of multi-agent systems to jailbreaking.
Systemic safety research includes work on biological and chemical risks.
Systemic safety: There has also been substantial work on watermarking and deepfake detection.
Systemic safety: Issues in tool learning safety have also been explored with Fudan NLP’s ToolSword framework.
For Monitoring (evaluations), benchmarks from SHLAB and TJUNLP test for a number of frontier safety misuse cases.
Monitoring (evaluations) also includes new work on evaluating LLM value alignment.
Monitoring (interpretability) research accounts for a much smaller proportion of papers than other research directions, in part because much of this work focuses on models smaller than frontier large models.
International Governance

Overview of key developments since October 2023
3.1 Multilateral Governance: In multilateral fora, China signed the Bletchley Declaration and co-sponsored the first UNGA resolution on AI, demonstrating points of common ground on certain AI safety and governance issues.
3.2 Global South
3.3 Bilateral Governance
3.4 “Track 1.5” and “Track 2” dialogues
China’s participation in the UK AI Safety Summit and signing of the Bletchley Declaration showed that international dialogue on AI safety between China and the West can yield meaningful results.
China joined 120+ countries in co-sponsoring a landmark UNGA resolution on AI which had been initiated by the US.
China has re-emphasized interest in multilateral AI governance since announcing the Global AI Governance Initiative and signing the Bletchley Declaration.
Chinese companies joined international counterparts in drafting 2 international standards on AI safety and security.
3.1 Multilateral Governance
3.2 Global South: In addition to these multilateral efforts, China also announced new efforts to expand AI cooperation with African countries.
3.3 Bilateral Governance
3.4 “Track 1.5” and “Track 2” dialogues
China announced new projects on AI at the 2024 China–Africa Internet Development and Cooperation Forum, focusing on coordinating with Africa on global governance, with only a brief reference to AI safety topics.
3.1 Multilateral Governance
3.2 Global South
3.3 Bilateral Governance: China issued a joint statement on AI with France and is establishing a new AI-focused dialogue with the US.
3.4 “Track 1.5” and “Track 2” dialogues
The Sino-French joint statement indicates both governments are prioritizing AI governance, increasing chances for further dialogue on AI and deeper Chinese participation in the 2025 French AI summit.
Details about the China-US dialogue remain sparse, but there are hints that frontier AI safety will be on the agenda.
3.1 Multilateral Governance
3.2 Global South
3.3 Bilateral Governance
3.4 “Track 1.5” and “Track 2” dialogues: Track 1.5 and 2 dialogues between China and the West have increased over the last year, but there remain some gaps in the landscape.
Frontier AI discussions are a growing but still minor fraction of overall dialogues, and some key stakeholder groups are underrepresented at present.
One frontier AI safety Track 2 dialogue between top Chinese and Western AI scientific and governance experts produced substantive joint declarations on AI safety.
Some strategies Concordia AI has previously proposed for collaboration with China on international AI governance:
Domestic Governance

Overview of key developments since October 2023
4.1 Overarching national guidance: Top national leaders have not publicly prioritized AI safety any further, though experts have included provisions relevant to frontier safety in their drafts of the national AI law.
4.2 National regulations and policies
4.3 Science and technology ethics system
4.4 Voluntary standards
4.5 Local government action
The 2024 Government Work Report and field investigations by national leaders reveal interest in frontier AI development and AI-driven applications, but little focus on safety.
China does not view capabilities development and safety/security as zero-sum, and is simultaneously increasing efforts in both directions.
China is in the process of developing a national AI law, and 2 separate expert drafts have been released to date.
Both expert drafts center on promoting AI development but also contain provisions relevant to frontier AI safety.
Key provisions | Expert draft 1 | Expert draft 2
Licensing requirement for models with certain risky profiles | ✅ | ❌
New government agency for AI | ✅ | ❌
Tax credits for “safety governance” research or equipment | ✅ | ❌
Specialized oversight for foundation models above a certain (unspecified) size | ✅ | ✅
Provision on AGI value alignment | ❌ | ✅
Financial penalties for violations by AI developers | ✅ | ✅
Liability exemptions for open-source AI | ✅ | ✅
4.1 Overarching national guidance
4.2 National regulations and policies: National scientific funding has begun devoting greater attention to AI safety, but no new AI safety regulations have emerged.
4.3 Science and technology ethics system
4.4 Voluntary standards
4.5 Local government action
Over the past 6 months, China has not issued any new binding regulations relating to frontier AI.
Registration Information for Generative AI Services (as of March 2024)

Order | Location | Model name | Registering company | Registration number | Time of registration
1 | Beijing | ERNIE Bot (文心一言) | Baidu | Beijing-WenXinYiYan-20230821 | 2023/8/31
2 | Beijing | ChatGLM (智谱清言) | Zhipu AI | Beijing-ChatGLM-20230821 | 2023/8/31
3 | Beijing | Skylark (云雀大模型) | ByteDance | Beijing-YunQue-20230821 | 2023/8/31
The National Natural Science Foundation of China (NSFC) announced that it is accepting applications for the first projects on value alignment.
Institution | Date | Total Funding | Safety proportion | Types of safety research the grant can support
NSFC | | 3 million RMB (~$400,000) | 2 of 6 research directions | Large model value and safety alignment strategy; automated evaluation methods including safety and security
NSFC | | 20 million RMB (~$2.8 million) | 1 of 11 research directions |
NSFC | | 2.6 million RMB per project (~$360,000) | 1 of 19 research directions with China Unicom | Large speech synthesis models, including value alignment and bias
Chinese national security officials and organizations have become more publicly vocal about AI’s threats to national security, including brief references to AI safety risks.
A government-affiliated think tank discussed AI risks and recommended value alignment.
4.1 Overarching national guidance
4.2 National regulations and policies
4.3 Science and technology ethics system: There have been no policy updates on S&T ethics reviews, and little new information has emerged on how these are operationalized within companies and research institutions.
4.4 Voluntary standards
4.5 Local government action
4.1 Overarching national guidance
4.2 National regulations and policies
4.3 Science and technology ethics system
4.4 Voluntary standards: New standards have been issued on AI safety and security. They currently prioritize content security, but there is growing interest in frontier capabilities and safety testing.
4.5 Local government action
Government standards bodies have begun work on standards that could be relevant for frontier AI safety, and industry actors are also pursuing safety benchmarks.
China finalized its first national standard on generative AI security in February, focused on content security with a brief mention of frontier safety risks.
4.1 Overarching national guidance
4.2 National regulations and policies
4.3 Science and technology ethics system
4.4 Voluntary standards
4.5 Local government action: Local government policies focused on AI development also touch on frontier safety issues, such as strengthening risk foresight, safety testing for models, and promoting model alignment.
Most of the key provincial-level jurisdictions for AI have released policies on AGI or large models.
Testing of frontier AI safety measures in the provinces could inform and foreshadow future national actions.
Safety-relevant measures (one column per provincial-level jurisdiction)
Alignment | ❌ | ✅ | ❌ | ❌ | ❌ | ❌
Early warning of risks/disasters | ✅ | ❌ | ❌ | ✅ | ❌ | ❌
International cooperation | ❌ | ✅ | ✅ | ✅ | ✅ | ❌
Pre-deployment supervision | ❌ | ❌ | ❌ | ❌ | ❌ | ✅
S&T ethics | ❌ | ✅ | ✅ | ✅ | ✅ | ✅
Safety or security testing and evaluation | ✅ | ✅ | ❌ | ✅ | ✅ | ❌
Watermarking and provenance | ❌ | ❌ | ❌ | ✅ | ❌ | ❌
Lab and Industry Practices

Overview of key developments since October 2023
5.1 Industry alliance projects: At least 2 influential industry alliances are actively engaged in initiatives on AI safety, security, and governance.
5.2 Safety of published models
5.3 Corporate ethics and governance work
AIIA and the Cyber Security Association of China (CSAC) are major government-backed players pursuing projects on AI safety, security, and governance.
Thus far, AIIA’s Safety and Security Governance Committee has been the most active on frontier safety, though the Policy and Law working group has also shown interest.
Meanwhile, CSAC has been focused on corpus development, safety or security testing, and multimodal AI.
5.1 Industry alliance projects
5.2 Safety of published models: Over the past 6 months, 3 additional labs released details about safety measures for models they published, but they appear to have taken little action on frontier AI safety.
5.3 Corporate ethics and governance work
Disclosures from SHLAB, Zhipu AI, and DeepSeek reveal some efforts to align models to human intentions and prevent toxic content, but no testing for frontier risks.
5.1 Industry alliance projects
5.2 Safety of published models
5.3 Corporate ethics and governance work: Details about how companies implement AI ethics and governance measures are largely unknown, though Ant Group asserts it has made significant investments. Other companies have produced reports analyzing frontier AI risks.
Ant Group claims that 20% of its large model technical personnel work on S&T ethics, but this is difficult to verify.
Recent reports by commercial actors have also begun to discuss frontier AI risks with greater sophistication and to lay out company efforts to combat such risks.
Expert Views

Overview of key developments since October 2023
6.1 International coordination: In a recent dialogue, top Chinese and foreign experts signed a consensus statement covering key aspects of frontier AI risks, policy recommendations, and red lines.
6.2 R&D funding devoted to AI safety
6.3 AI and biological security
6.4 Discussion in party venues
The 2 IDAIS meetings show that a number of influential Chinese and Western experts agree on measures for ensuring the safety of frontier AI models.
6.1 International coordination
6.2 R&D funding devoted to AI safety: The idea of devoting a minimum level of national and corporate R&D funding to AI safety or governance research has received some attention and support in Chinese domestic discourse.
6.3 AI and biological security
6.4 Discussion in party venues
Kai-Fu Lee and several other leading Chinese AI experts expressed support for minimum funding or resourcing levels for AI safety.
6.1 International coordination
6.2 R&D funding devoted to AI safety
6.3 AI and biological security: There is nascent discussion in policy advisory circles about risks arising from the combination of AI and biological threats.
6.4 Discussion in party venues
While these discussions are nascent, all actors who have weighed in are influential policy advisors.
6.1 International coordination
6.2 R&D funding devoted to AI safety
6.3 AI and biological security
6.4 Discussion in party venues: Warnings regarding the potential risks of frontier AI have also begun to arise in venues directed more towards party elites than scientific audiences.
2 leading experts discussed frontier AI safety risks in notable party venues in recent months.
Public Opinion

Overview of key developments since October 2023

* These conclusions should be treated with caution due to the lack of representative polls.
There was only one new relevant poll over the past six months, which did not yield any clarifying results.
Additional Resources

Thank you for reading our report!
Key acronyms (1)
AGI | Artificial General Intelligence | 通用人工智能
AIIA | Artificial Intelligence Industry Alliance of China | 人工智能产业发展联盟
BAAI | Beijing Academy of Artificial Intelligence | 北京智源人工智能研究院
BIGAI | Beijing Institute for General Artificial Intelligence | 北京通用人工智能研究院
CAC | Cyberspace Administration of China | 网信办
CAICT | China Academy of Information and Communications Technology | 中国信息通信研究院
CAIS | Center for AI Safety | 人工智能安全中心
CAISG | Peking University Center for AI Safety and Governance | 人工智能安全与治理中心
CASS | Chinese Academy of Social Sciences | 中国社会科学院
CBRN | Chemical, Biological, Radiological and Nuclear | 化学、生物、放射和核
CNCERT/CC | National Computer Network Emergency Response Technical Team/Coordination Center of China | 国家计算机网络应急技术处理协调中心
Key acronyms (2)
CoAI | Tsinghua Conversational AI research group | 交互式人工智能课题组
CPC | Communist Party of China | 中国共产党
CSAC | Cyber Security Association of China | 中国网络空间安全协会
CUPL | China University of Political Science and Law | 中国政法大学
CVDA | Peking University Computer Vision and Digital Art Lab | 计算机视觉与数字艺术实验室
DRC | Development Research Center | 国务院发展研究中心
GAIR | Shanghai Jiao Tong University Generative Artificial Intelligence Research Lab | 生成式人工智能研究组
HKUST | Hong Kong University of Science and Technology | 香港科技大学
I-AIIG | The Institute for AI International Governance of Tsinghua University | 清华大学人工智能国际治理研究院
Key acronyms (3)
IDAIS | International Dialogues on AI Safety | 人工智能安全国际对话
LLM | Large Language Model | 大语言模型
MIIT | Ministry of Industry and Information Technology | 工信部
MOFA | Ministry of Foreign Affairs | 外交部
MOST | Ministry of Science and Technology | 科技部
MSRA | Microsoft Research Asia | 微软亚洲研究院
MSS | Ministry of State Security | 国安部
NDRC | National Development and Reform Commission | 发改委
NPC | National People’s Congress | 全国人民代表大会
NSFC | National Natural Science Foundation of China | 国家自然科学基金委员会
PAIR | PKU Alignment and Interaction Lab | 北大AI对齐团队
RLHF | Reinforcement Learning from Human Feedback | 人类反馈强化学习
Key acronyms (4)
SAC | Standardization Administration of China | 中国标准化管理委员会
SHJT | Shanghai Jiao Tong University | 上海交通大学
SHLAB | Shanghai Artificial Intelligence Laboratory | 上海人工智能实验室
TC260 | National Information Security Standardization Technical Committee (National Technical Committee 260 on Cybersecurity of Standardization Administration of China) | 全国信息安全标准化技术委员会
THUNLP | Natural Language Processing Lab at Tsinghua University | 清华大学自然语言处理与社会人文计算实验室
TJUNLP | Tianjin University Natural Language Processing Laboratory | 天津大学自然语言处理实验室
UNGA | United Nations General Assembly | 联合国大会
WAIC | World Artificial Intelligence Conference | 世界人工智能大会
About Us

About Concordia AI (安远AI)
We have 3 main areas of work. See our 2023 Annual Review for more details.
Focus 1: Advising on Chinese AI safety and governance
Focus 2: Technical AI safety field-building in China
Focus 3: Promoting international cooperation
Conflicts of interest
Follow our work through our Substack newsletter, translations of AI expert views, and more: Substack, Translated Expert Articles, Website, and WeChat official account.
Acknowledgements

This report was authored by Jason Zhou, Kwan Yee Ng, and Brian Tse.

We would like to express our sincere gratitude to the entire Concordia AI team for their tireless contributions throughout the development of this report. Their constructive feedback and dedicated analytical support were instrumental in shaping the content and ensuring the quality of our work. We are also deeply indebted to our network of affiliates and collaborators for multiple rounds of meticulous review of various sections of the report. Their assistance in curating our database of frontier AI safety papers was an essential foundation for our research. We additionally thank our expert reviewers for their suggestions on improving the methodology for the technical AI papers database and other valuable feedback.
© 2024 Concordia AI, all rights reserved.