Exploring Song Lyrics through Digital Text Analysis Tools
Xianzhong Meng
T Cruz
2025-10-30
CONTENTS
2. Workshop Overview
2.1 Workshop Overview
2.2 Workshop Overview - T
2.3 Workshop Overview - Xianzhong
3. T’s Session (0–30 min)
Building the Corpus
Kaggle is a data science company whose website hosts datasets that you can load into software such as LancsBox
Purpose: To trace how pop lyrics express awareness, resistance, and/or emotional refuge
Method: keyword frequency, collocation, and topic modeling
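The first two methods can be illustrated with a minimal Python sketch. It is not how LancsBox computes keywords or collocations, only a plain count-based approximation; topic modeling is not covered here. The function names, the stopword set, and the one-word span are assumptions made for illustration, and the sample text is the example sentence used later in this workshop rather than real lyrics.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"\w+(?:['’]\w+)*", text.lower())

def keyword_frequencies(tokens, stopwords=frozenset()):
    """Count how often each token occurs, skipping any stopwords."""
    return Counter(t for t in tokens if t not in stopwords)

def collocates(tokens, node, span=1):
    """Count the words that appear within `span` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        left = tokens[max(0, i - span):i]
        right = tokens[i + 1:i + 1 + span]
        counts.update(left + right)
    return counts

# Example sentence from the lexical diversity part of this workshop.
tokens = tokenize("Love is freedom, and freedom is love forever and ever.")

# Hypothetical stopword set, chosen only for this illustration.
freqs = keyword_frequencies(tokens, stopwords={"is", "and"})
print(freqs.most_common())

# Words directly next to "freedom" (one token to the left or right).
print(collocates(tokens, "freedom", span=1))
```

With a real corpus, the same functions would be applied to each artist's lyrics separately so that the frequency lists can be compared.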
Finding Keywords
Artist | Keywords | Thematic Focus |
Beyoncé | freedom, black, power, woman | empowerment, spirituality |
Taylor Swift | speak, truth, right, voice | nostalgia, agency |
Billie Eilish | real, die, world | authenticity, anxiety |
Frequencies and Patterns of Activist Language
Conclusion: A keyword analysis reveals consistent rhetorical patterns: identity, resistance, and emotional truth-telling appear across all artists but differ in tone and intensity.
Common thread: All artists turn personal experience into social commentary
Key takeaways: Activism in lyrics isn’t always overtly political; sometimes it’s emotional or identity-driven. Beyoncé embodies collective liberation. Swift reframes gendered power. Eilish critiques authenticity and mental health stigma.
Music as Rebellion & Safe Space
Rebellion against power systems:
Music as a safe space:
Intersecting Voices of Resistance
Comparative Insights
Key takeaway: These three voices show generational and cultural differences in expressing resistance:
Beyoncé externalizes activism
Swift negotiates identity in the public eye
Eilish internalizes rebellion
Conclusion & Possible Future Work
Music as a Rhetoric of Empowerment
4. Xianzhong’s Session (30 min)
4.1 Exploring lexical diversity
Song | Lyrics Sample | Tokens | Types | TTR |
Repetitive lyrics | “Love, love, love, love, love, love, love, love, love…” | 9 | 1 | 1/9 |
Even though it’s short, the same word repeats many times, so the token count is high but lexical diversity is low. | ||||
Varied sentence | “Love begins softly, grows stronger, and never ends.” | 8 | 8 | 8/8 |
Fewer repetitions and more unique words: total tokens are lower but lexical diversity is higher. | ||||
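The two rows above can be checked with a short Python sketch. The tokenizer is a simplifying assumption (lowercase, punctuation dropped); other tools may tokenize differently and give slightly different counts. The function names are illustrative.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"\w+(?:['’]\w+)*", text.lower())

def type_token_ratio(tokens):
    """Return (types, tokens, TTR) for a list of tokens."""
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    ttr = n_types / n_tokens if n_tokens else 0.0
    return n_types, n_tokens, ttr

repetitive = "Love, love, love, love, love, love, love, love, love"
varied = "Love begins softly, grows stronger, and never ends."

for label, text in [("Repetitive lyrics", repetitive), ("Varied sentence", varied)]:
    types, tokens, ttr = type_token_ratio(tokenize(text))
    print(f"{label}: types={types}, tokens={tokens}, TTR={ttr:.3f}")
```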
4.1 Exploring lexical diversity
For example: “Love is freedom, and freedom is love forever and ever.” | ||||
Window | Words (window size 5) | Types | Tokens | TTR |
1 | Love is freedom and freedom | 4 | 5 | 4/5 = 0.8 |
2 | is freedom and freedom is | 3 | 5 | 3/5 = 0.6 |
3 | freedom and freedom is love | 4 | 5 | 4/5 = 0.8 |
4 | and freedom is love forever | 5 | 5 | 5/5 = 1 |
5 | freedom is love forever and | 5 | 5 | 5/5 = 1 |
6 | is love forever and ever | 5 | 5 | 5/5 = 1 |
Moving-average type-token ratio: MATTR5 = (0.8 + 0.6 + 0.8 + 1.0 + 1.0 + 1.0) ÷ 6 = 0.8667 | ||||
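The sliding-window calculation above can be reproduced with a minimal Python sketch. It assumes a simple lowercase tokenizer and a window that moves one word at a time; the function names are illustrative and not taken from any particular tool.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"\w+(?:['’]\w+)*", text.lower())

def window_ttrs(tokens, window):
    """Return the TTR of every consecutive window of `window` tokens."""
    return [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]

def mattr(tokens, window):
    """Moving-average TTR: the mean of the per-window TTRs.

    If the text is shorter than one window, fall back to the plain TTR.
    """
    if not tokens:
        return 0.0
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)
    ratios = window_ttrs(tokens, window)
    return sum(ratios) / len(ratios)

tokens = tokenize("Love is freedom, and freedom is love forever and ever.")
print(window_ttrs(tokens, 5))        # one TTR per window, six windows in total
print(round(mattr(tokens, 5), 4))    # mean of the six window TTRs
```

For real lyrics the window would be much larger than 5; the small window here only keeps the worked example readable.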
4.1 Exploring lexical diversity
Differences between TTR and MATTR
Feature | TTR (Type-Token Ratio) | MATTR (Moving-Average TTR) |
How it’s calculated | Unique words ÷ total words | Average of the TTRs of successive 50-word windows |
Effect of text length | Falls as a text gets longer, so texts of different lengths are hard to compare (unstable) | Stays consistent even when texts differ in length (more reliable) |
What it shows | A basic snapshot of vocabulary variety | A smoother, more accurate measure of lexical diversity |
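The effect of text length can be seen in a small Python sketch that compares one sentence with the same sentence repeated ten times. The vocabulary is identical in both texts, so a length-independent measure should barely move. This is an illustration under simplifying assumptions: a basic tokenizer, and a window of 5 instead of 50 because the sample is tiny. The function names are illustrative.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"\w+(?:['’]\w+)*", text.lower())

def ttr(tokens):
    """Plain type-token ratio: unique words divided by total words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mattr(tokens, window):
    """Mean TTR over every consecutive window of `window` tokens."""
    if len(tokens) < window:
        return ttr(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)

sentence = "Love is freedom, and freedom is love forever and ever."
short_text = tokenize(sentence)
long_text = tokenize(" ".join([sentence] * 10))  # same vocabulary, ten times longer

for name, toks in [("short", short_text), ("long", long_text)]:
    print(f"{name}: tokens={len(toks)}  TTR={ttr(toks):.3f}  MATTR(5)={mattr(toks, 5):.3f}")
```

TTR shrinks as the text grows because the number of types stays at six while the token count rises, whereas the windowed average changes only slightly.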
4.2 Exploring lexical diversity
Measure | Meaning | What it shows for lyrics |
Tokens | Total number of words in the text | Song length: how much language is used overall |
MATTR (Moving-Average TTR) | Average lexical diversity over 50-word windows | Shows how varied or repetitive the vocabulary is, adjusted for text length |
4.2 Exploring lexical diversity
Corpus | Song Content | Tokens | MATTR (%) |
Beyoncé | 406 | 189,871 | 24.35 |
4.2 Exploring lexical diversity
Corpus | Song Content | Tokens | MATTR (%) |
Taylor Swift | 479 | 247,682 | 19.42 |
4.2 Exploring lexical diversity
Corpus | Song Content | Tokens | MATTR (%) |
Billie Eilish | 145 | 64,231 | 37.18 |
Corpus | Song Content | Tokens | MATTR (%) |
Beyoncé | 406 | 189,871 | 24.35 (<0.6) |
Repetitive rhythmic lyrics; moderate diversity typical of R&B and pop hooks. | |||
Taylor Swift | 479 | 247,682 | 19.42 (<0.6) |
Extensive narrative songwriting with thematic repetition. | |||
Billie Eilish | 145 | 64,231 | 37.18 (<0.6) |
Highest lexical diversity; introspective and experimental writing. | |||
THE END
THANKS