GMU Systems for the IWSLT 2023 Dialect and Low-resource Speech Translation Tasks
Jonathan Kabala Mbuya
Antonios Anastasopoulos
{jmbuya, antonis}@gmu.edu
Motivation
https://mitratranslations.com/en/tag/website-language-vesion/
Communication is at the core of human societies
Over 7100 languages in the world
Languages play a key part in communication
Translation helps with cross-language communication
Machine & Speech Translation can help
Challenges
Dialectal and Low-resource Speech Translation
Potential Solution
Machine Translation
Speech Translation
IWSLT 2023 Tasks
5 of the 6 available Low-resource Tasks
1 Dialectal Task
Task | Train set (hours) | Task type |
Irish to English | 11 | Low-resource |
Marathi to Hindi | 15.3 | Low-resource |
Pashto to French | 61 | Low-resource |
Tamasheq to French | 17 | Low-resource |
Quechua to Spanish | 1.6 | Low-resource |
Tunisian Arabic to English | 160 | Dialectal |
Constrained
Unconstrained
Additional Data
Proposed Methods: Baseline Models
End-to-end speech translation (E2E)
End-to-end speech translation with ASR encoder initialization (E2E-ASR)
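The idea behind ASR encoder initialization can be sketched as copying the encoder parameters of a trained ASR model into a fresh speech translation model before training it. This is a minimal illustration and not the authors' implementation: the function name `init_encoder_from_asr`, the `encoder.` name prefix, and the toy parameter values are all assumptions made for the example. Plain dictionaries stand in for model state dictionaries that map parameter names to values.

```python
from typing import Any, Dict


def init_encoder_from_asr(
    st_state: Dict[str, Any],
    asr_state: Dict[str, Any],
    prefix: str = "encoder.",  # assumed naming convention, hypothetical
) -> Dict[str, Any]:
    """Return a copy of st_state whose encoder entries are taken from asr_state.

    Only parameters whose names start with `prefix` and that exist in both
    models are copied; everything else (e.g. the decoder) keeps its own
    initial values.
    """
    merged = dict(st_state)  # shallow copy, the input is left untouched
    for name, value in asr_state.items():
        if name.startswith(prefix) and name in merged:
            merged[name] = value
    return merged


# Toy stand-ins for real parameter tensors (values are made up).
asr_state = {"encoder.layer0.weight": [0.1, 0.2], "decoder.out.weight": [0.9]}
st_state = {"encoder.layer0.weight": [0.0, 0.0], "decoder.out.weight": [0.5]}

st_init = init_encoder_from_asr(st_state, asr_state)
```

After this step the translation model would be trained as usual, with the encoder starting from the ASR weights and the decoder starting from its own initialization.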
Proposed Methods: Using Self-Supervised Speech Models
Wav2vec 2.0: W2V-E2E
HuBERT: Hubert-E2E
XLSR-53: XLSR-E2E
Layer Removal Results on Wav2vec 2.0
Based on Pasad et al., 2022
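As a rough illustration of what layer removal means, the sketch below drops the last few layers from an ordered stack of encoder layers. It is an assumption-laden example, not the setup used for the reported results: the helper `remove_top_layers` is hypothetical, strings stand in for real transformer layers, the 12-layer stack corresponds to the Base configuration of wav2vec 2.0, and removing 3 layers from the top is an arbitrary choice for the example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def remove_top_layers(layers: Sequence[T], n_remove: int) -> List[T]:
    """Return the layer stack without its last `n_remove` layers."""
    if not 0 <= n_remove < len(layers):
        raise ValueError("n_remove must leave at least one layer in place")
    return list(layers[: len(layers) - n_remove])


# Placeholder layer stack: 12 transformer layers, as in wav2vec 2.0 Base.
encoder_layers = [f"transformer_layer_{i}" for i in range(12)]

# Hypothetical choice for illustration: drop the top 3 layers.
trimmed = remove_top_layers(encoder_layers, 3)
```

With a real model the same slicing would be applied to the encoder's list of transformer layers before fine-tuning, so that the translation decoder reads from an earlier layer's representation.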
Low-resource Task Results
Low-resource Task Results (continued)
Dialectal Task Results
Conclusion