Lost in Translation: Investigating Systematic Discrepancies between Parallel English and Chinese Names of American Chinese Restaurants (ACRs)
Hang Jiang, Nanxi Liu, Cassandra Overney, Rukun Zhang, Artemisia Luk, Diyi Yang, Jad Kabbara, Deb Roy
MIT Center for Constructive Communication, Stanford University
Background
It begins with a story…
Generated by ChatGPT-4o
Szechuan vs. Hangzhou
Translation: Intoxicated Hangzhou
More Mismatched Names
Lingnan Restaurant
Taste of Sea
Jiangnan Mansion
Szechuan Mountain + Pangolin (pun)
Why Does It Matter?
For these Chinese immigrants, the parallel names of their restaurants reflect both an embrace of cultural assimilation and a preservation of their home identity.
Prior Work
Prior work has primarily focused on common English ACR names – “panda”, “china”, “wok”, “great wall” (Chen, 2018).
No one has studied the parallel English and Chinese ACR names.
Research Questions (RQs)
Dataset & Method
Name Transcription
Sampled and annotated 3,166 ACRs (10% of ACRs in the U.S.)
Chinese Name: 快乐人
Translation: Happy Guy
English Name: Uncle Lou
8 Name Frames
Adapted from Chen (2018)
Frame Annotation + Model Evaluation
Model Evaluation
Models (e.g., rule-based, ChatGPT) → predicted label (e.g., 1)
Model Comparison
Validation
Please decide if a restaurant name contains any personal name.
Definition
Personal names are usually, but not necessarily, surnames or first names.
Here are some common examples:
- Surnames with a possessive: Qing’s Kitchen, Hoy’s Wok
- Names without a possessive: China Lee, Hunan Mao, House of Louie
If the name of the restaurant “Uncle Lou” contains any personal name, return 1; if not, return 0.
Codebook → Instructions
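The instruction shown above follows a fixed pattern: task statement, definition, examples, then a query that asks for 1 or 0. A minimal sketch of how such an instruction could be assembled from a codebook entry is given here. The function names `build_frame_prompt` and `parse_binary_label` and their parameters are hypothetical, and no particular model API is assumed; the sketch only builds the text and reads back a strict 1/0 reply.

```python
def build_frame_prompt(restaurant_name, frame_label, definition, examples):
    """Assemble an annotation instruction for one name frame.

    `examples` is a list of (title, [names]) pairs taken from the codebook.
    All argument names here are illustrative, not from the original slides.
    """
    lines = [
        f"Please decide if a restaurant name contains any {frame_label}.",
        "Definition",
        definition,
        "Here are some common examples:",
    ]
    # One bullet per example group, mirroring the codebook layout.
    lines += [f"- {title}: {', '.join(names)}" for title, names in examples]
    lines.append(
        f'If the name of the restaurant "{restaurant_name}" contains any '
        f"{frame_label}, return 1; if not, return 0."
    )
    return "\n".join(lines)


def parse_binary_label(reply):
    """Return 1 or 0 for a reply that is exactly '1' or '0', else None."""
    text = reply.strip()
    return int(text) if text in ("0", "1") else None


prompt = build_frame_prompt(
    "Uncle Lou",
    "personal name",
    "Personal names are usually, but not necessarily, surnames or first names.",
    [
        ("Surnames with a possessive", ["Qing's Kitchen", "Hoy's Wok"]),
        ("Names without a possessive", ["China Lee", "Hunan Mao", "House of Louie"]),
    ],
)
```

Returning `None` for anything other than a bare 1 or 0 keeps malformed replies visible instead of silently coercing them into a label.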
Human Annotation
3 annotators (e.g., labels 1, 0, 1) → rounds of discussion → decision
Quantifying Naming Discrepancies
1. Frame-based Jaccard Distance (JD)
2. Embedding-based Cosine Distance (CD)
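Both measures can be sketched in a few lines. The code below assumes each name has already been annotated with a set of frames and mapped to an embedding vector; the slides do not specify the embedding model, so the vectors here are placeholders, and the frame label `other_frame` is hypothetical. Treating two empty frame sets as distance 0 is also an assumption.

```python
import math


def jaccard_distance(frames_a, frames_b):
    """1 - |A ∩ B| / |A ∪ B| over the frame sets of two names."""
    a, b = set(frames_a), set(frames_b)
    union = a | b
    if not union:
        # Assumption: two names with no frames are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# English name frames vs. Chinese name frames (labels are illustrative).
jd = jaccard_distance({"personal_name"}, {"other_frame"})  # no shared frame -> 1.0
# Placeholder 2-d vectors standing in for real name embeddings.
cd = cosine_distance([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors -> 1.0
```

The frame-based distance only registers whether the two names share annotated frames, while the embedding-based distance can also pick up differences in meaning within the same frame.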
Interviews in Chinatown
Analysis
RQ1: What Is Lost in Translation?
RQ2: Location vs. Naming Discrepancies
ACRs in urban areas show higher discrepancies than those in rural and suburban areas
RQ2: Chinese Percentage vs. Naming Discrepancies
ACRs in areas with a higher percentage of Chinese residents show higher discrepancies
RQ3: Explanations of Phenomena
1. Customer base
2. Neighborhood
3. Personal experience
R7 from Chinatown: “… unique to stand out so that they can have business …”
R9 without Chinese name: “There are no Chinese customers here …”
R3 with personal name: “I want to show love to my daughter …”
Takeaways
Thank you!
hjian42@mit.edu