1 of 11

The Silent Curriculum: How Does LLM Monoculture Shape Educational Content and Its Accessibility?

Supriti Vijay
Digital Marketing, Adobe

Aman Priyanshu
School of Computer Science, Carnegie Mellon University

2 of 11

Introduction

  • Large Language Models (LLMs) are increasingly becoming a primary source of knowledge
  • LLMs may propagate a singular perspective, creating an "LLM Monoculture"
  • The "Silent Curriculum" shapes children's learning through LLM responses

3 of 11

Experimental Setup

  • Utilized GPT-3.5 and LLaMA2-70B as subjects
  • Generated an Ethnicity and Top 20 Occupations corpus
  • LLMs created short stories about children's success in given occupations

4 of 11

Generating the Occupational-Racial Bias Benchmark

  • LLMs generated occupations for each ethnic group
  • Examples: White - "Corporate Executive", Black - "Music Producer", Asian - "Software Engineer"
  • Occupations reflect cultural and stereotypical biases

The prompt used to generate these biased race-occupation pairs is presented in Appendix A.1.
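The paper's actual prompt appears in its Appendix A.1; as a minimal sketch of how such pair-elicitation prompts could be constructed (the group list and wording below are illustrative assumptions, not the paper's):

```python
# Illustrative ethnic groups; the paper's actual list may differ.
ETHNIC_GROUPS = ["White", "Black", "Asian", "Hispanic", "Native American"]

def occupation_prompt(group: str) -> str:
    """Build a hypothetical elicitation prompt (the real one is in Appendix A.1)."""
    return f"List the top 20 occupations you most associate with {group} individuals."

# One prompt per group; each would then be sent to GPT-3.5 / LLaMA2-70B
# through the respective API client (omitted here).
prompts = {group: occupation_prompt(group) for group in ETHNIC_GROUPS}
```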

5 of 11

Children's Story Generation

  • LLMs crafted narratives about a child's journey to success in a specific occupation
  • Prompts omitted any direct mention of ethnicity, specifying only the occupation
  • Aimed to probe implicit biases in the models' storytelling
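An ethnicity-free story prompt of this kind might be built as follows (the wording is an assumption for illustration, not the paper's exact prompt):

```python
def story_prompt(occupation: str) -> str:
    # Only the occupation is specified; ethnicity is deliberately omitted,
    # so any demographic signal (e.g. the child's name) comes from the model.
    return (
        "Write a short story about a child who grows up to become "
        f"a successful {occupation}. Give the child a name."
    )

print(story_prompt("Software Engineer"))
```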

6 of 11

Self-Annotating Ethnic/Racial Groups

  • LLMs annotated ethnicities of characters in the generated stories
  • Investigated self-consistency of models and disparities between biases and narratives
  • Calculated cosine similarity scores to quantify consistency in cultural representation (Reference: Appendix A.3)
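Consistency between the two models' ethnicity distributions can be quantified with cosine similarity over per-ethnicity counts; a minimal sketch (the count vectors below are made-up illustrations, not the paper's data):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two count vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical per-ethnicity story counts over the same occupation set
gpt_counts = [120, 30, 25, 15, 10]
llama_counts = [110, 35, 30, 12, 13]
print(round(cosine_similarity(gpt_counts, llama_counts), 2))
```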

7 of 11

Results and Discussion

8 of 11

Comparison of LLM-specified occupational ethnicity counts against the ethnicity inferred from the protagonist's name. The heatmap illustrates discrepancies in portrayals, revealing potential biases in cultural representations within AI-generated narratives. (Panels: GPT-3.5 and LLaMA2-70B)

9 of 11

Heatmap illustrating discrepancies between LLM-specified occupational ethnicity counts and the inferred country, providing insight into potential biases in cultural depictions within AI-generated content. (Panels: GPT-3.5 and LLaMA2-70B)

10 of 11

Questions We Need to Ask (Provocation)

  1. Is the generation of biased occupation-race/ethnicity pairs safe?
    • LLMs generate stereotypical associations between occupations and ethnicities
    • These associations can reinforce harmful biases and limit children's aspirations

  2. Representation disparity in children's stories generated by LLMs
    • Certain races appear more often, even when the models' own stated associations differ
    • Are safety fine-tuning techniques biasing models to be "fair" according to inconsiderate benchmarks?

  3. High cosine similarity (0.86 and 0.87) between these two LLMs suggests bias convergence
    • Likely due to shared or similar training datasets and pre-training methods
    • What are the implications of a homogenized AI perspective on society?

11 of 11

Conclusion

  • LLM outputs are influenced by societal biases and cultural narratives
  • The "Silent Curriculum" may shape children's learning and perpetuate stereotypes
  • A collective effort is needed to challenge and broaden the AI monoculture