CAMeL Lab Registration Form: "MADAR-Turk Corpus"
The MADAR-Turk corpus adds Turkish sentences to the MADAR Corpus (Bouamor et al., 2018), which provided the first set of parallel sentences to include the dialects of 25 Arab cities in addition to English, French, and MSA. The MADAR Corpus was built on the Basic Traveling Expression Corpus (BTEC) (Takezawa et al., 2007) and comprised about 20,000 English tourism-related sentences. BTEC is conversational in nature, has short sentences, and has translations in several languages, making it an attractive resource for building and testing machine translation models.

To create MADAR-Turk, two native Arabic speakers from Syria who are highly fluent in Turkish translated all 2,000 Damascus dialect sentences into Turkish. We started from the Damascus entries because our initial objective was to work on Syrian Arabic to Turkish machine translation.

The sentences came from the following sub-splits in the MADAR Corpus:
200 corpus-6-test-corpus-26-dev
200 corpus-6-test-corpus-26-test
1600 corpus-6-test-corpus-26-train
Email *
First Name *
Last Name *
Affiliation *
Website (optional)
What do you plan to use this resource for? *
License - please read the following license:
//////////////////////////////////////////////////////////////////////
// License for MADAR-Turk Corpus
//////////////////////////////////////////////////////////////////////

The MADAR-Turk Corpus is provided under an Attribution-ShareAlike 4.0
International (CC BY-SA 4.0) License: https://creativecommons.org/licenses/by-sa/4.0/

If you use this corpus split, please cite:
Hasan Alkheder, Houda Bouamor, Nizar Habash and Ahmet Zengin. Benchmarking Dialectal Arabic-Turkish Machine Translation. In Proceedings of the Machine Translation Summit, Macau SAR, China, 2023.

//////////////////////////////////////////////////////////////////////
By clicking "Yes" you agree to the terms of this license. *
Citing Guide
When citing this resource, please use: Hasan Alkheder, Houda Bouamor, Nizar Habash and Ahmet Zengin. Benchmarking Dialectal Arabic-Turkish Machine Translation. In Proceedings of the Machine Translation Summit, Macau SAR, China, 2023.
By clicking "Yes" you agree to use this citing guide. *
Publications
Hasan Alkheder, Houda Bouamor, Nizar Habash and Ahmet Zengin. Benchmarking Dialectal Arabic-Turkish Machine Translation. In Proceedings of the Machine Translation Summit, Macau SAR, China, 2023.
This form was created inside of New York University.
