SDG METADATA TRANSLATION PILOT

PHASE 1 RESULTS AND PROPOSED NEXT STEPS

May 1, 2020

SDG Metadata Translation Team

1. Purpose

  • Capacity building is necessary for all countries to prepare the data and statistics needed to meet the scope and ambitions of the 2030 Agenda.
  • A fundamental aspect of capacity building is ensuring that the data and statistics are prepared in a transparent and accessible manner.
  • Yet SDG metadata are currently available only in English (ENG) and Arabic (ARB).

SDG Metadata Translation Pilot

1. Purpose (continued)

  • Preparing such translations is extremely time consuming and therefore costly. There are 231 indicators with roughly 10 pages of technical text each, so about 2,310 pages to translate for each language.
  • Version control is a challenge. The indicators are refined and revised over time, and it is difficult for users to know what changed and when.
  • A solution is needed that provides an efficient and sustainable way to translate SDG metadata.
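
The scale of the task can be checked with a few lines of arithmetic. The sketch below uses the two figures from the slide (231 indicators, roughly 10 pages each); the number of target languages passed in the second call is an illustrative assumption, not a project plan.

```python
# Figures taken from the slide: 231 indicators, roughly 10 pages each.
INDICATOR_COUNT = 231
APPROX_PAGES_PER_INDICATOR = 10  # approximate, per the slide


def pages_to_translate(num_languages: int) -> int:
    """Approximate number of pages to translate for the given number of languages."""
    return INDICATOR_COUNT * APPROX_PAGES_PER_INDICATOR * num_languages


print(pages_to_translate(1))  # 2310 pages for a single target language
print(pages_to_translate(4))  # 9240 pages for four languages (hypothetical count)
```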

2. Objective

The objective of the SDG Metadata Translation project is to examine the feasibility, efficiency, and ease of use of computer-assisted translation for the SDG metadata files.

Requirements:

  • Manage thousands of pages across hundreds of files
  • Available in multiple languages
  • Open source (free or very low fee)
  • Suitable for technical language
  • Easy for anyone to use

3. Approach

  1. We worked with several partners on various aspects of the pilot project.
  2. The World Bank funded project management and data science work to build and evaluate the pilot translation protocol.
  3. UNECE and ROSSTAT funded the translation of the selected indicator metadata files into Russian.
  4. UNSD’s SDMX and IAEG-SDGs teams advised us on the mapping of metadata concepts as we prepared the machine-readable files.

4. Method

  1. We selected SDG indicator files from the UNSD Metadata Repository. For the pilot, we selected 10 indicators for which the World Bank is a custodian agency.

4. Method (continued)

  1. We applied the SDMX metadata concepts to the IAEG-SDG metadata file to create an SDG SDMX-compliant, machine-readable file. This allows automation.
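
To illustrate what mapping a metadata file onto concepts can look like, here is a minimal sketch. It is not the project's actual conversion code: the section headings, the concept IDs (prefixed `EXAMPLE_`), the indicator code `X.Y.Z`, and the JSON output layout are all placeholders assumed for illustration, not the real SDMX concept scheme.

```python
import json

# Hypothetical mapping from section headings to concept IDs.
# These IDs are placeholders, NOT the real SDMX metadata concepts.
HEADING_TO_CONCEPT = {
    "Definition": "EXAMPLE_DEFINITION",
    "Data sources": "EXAMPLE_DATA_SOURCES",
    "Method of computation": "EXAMPLE_COMPUTATION",
}


def to_machine_readable(indicator_id: str, sections: dict) -> dict:
    """Key each section by its concept ID; collect headings with no mapping."""
    fields = {}
    unmapped = []
    for heading, text in sections.items():
        concept = HEADING_TO_CONCEPT.get(heading.strip())
        if concept is None:
            unmapped.append(heading)  # flag for manual review
        else:
            fields[concept] = text.strip()
    return {"indicator": indicator_id, "fields": fields, "unmapped": unmapped}


record = to_machine_readable(
    "X.Y.Z",  # placeholder indicator code
    {
        "Definition": "Placeholder definition text.",
        "Data sources": "Placeholder description of sources.",
        "Comments": "A section with no concept assigned yet.",
    },
)
print(json.dumps(record, indent=2, ensure_ascii=False))
```

Once every section is keyed by a concept, each piece of text can be addressed, translated, and compared independently, which is what makes the later steps automatable.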

4. Method (continued)

  1. We uploaded the machine-readable metadata file to GitHub, a platform for open source development. Anyone can use it for free to manage and share files.

4. Method (continued)

  1. We imported the files into Weblate, an open source translation platform. It is free to use for open source projects. Anyone can access it.

4. Method (continued)

  1. A human translator logs into Weblate and selects the SDG project and indicator.

4. Method (continued)

  1. The human translator selects “Start new translation,” chooses the target language, and clicks “Translate.” In this pilot, SDG metadata are translated into Russian.

4. Method (continued)

  1. The human translator chooses among the several translations offered at the bottom of the page. S/he can edit the chosen translation if necessary and add terms to the glossary.
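
The idea of offering candidate translations can be sketched with a small translation memory and fuzzy matching. This is only an illustration of the general technique using Python's standard `difflib`; it does not reproduce how Weblate generates its suggestions, and the memory entries and the `suggest` helper are hypothetical.

```python
import difflib

# Hypothetical translation memory: source sentences already translated into Russian.
MEMORY = {
    "Data are collected annually.": "Данные собираются ежегодно.",
    "Data are collected quarterly.": "Данные собираются ежеквартально.",
    "No data are available.": "Данные отсутствуют.",
}


def suggest(source: str, memory: dict, limit: int = 3, cutoff: float = 0.6) -> list:
    """Return (similar source sentence, stored translation) pairs, best match first."""
    matches = difflib.get_close_matches(source, list(memory), n=limit, cutoff=cutoff)
    return [(match, memory[match]) for match in matches]


suggestions = suggest("Data are collected monthly.", MEMORY)
for similar_source, translation in suggestions:
    print(f"{similar_source} -> {translation}")
```

The translator still makes the final decision; suggestions like these only reduce how much has to be typed from scratch.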

4. Method (continued)

  1. When a translation is complete, it is automatically uploaded to GitHub. Custom code then populates the project website with the translated files.
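
The project's custom publishing code is not shown in these slides. As a rough sketch of the kind of step involved, the following turns a set of translated fields into a simple HTML fragment; the function name `render_page`, the field keys, and the sample text are assumptions made for illustration.

```python
import html


def render_page(title: str, fields: dict) -> str:
    """Render translated metadata fields as a simple HTML fragment."""
    parts = [f"<h1>{html.escape(title)}</h1>"]
    for concept, text in fields.items():
        parts.append(f"<h2>{html.escape(concept)}</h2>")
        # Escape the text so characters such as "<" display correctly.
        parts.append(f"<p>{html.escape(text)}</p>")
    return "\n".join(parts)


page = render_page(
    "Indicator X.Y.Z",  # placeholder indicator code
    {"EXAMPLE_DEFINITION": "Share of population with income < threshold."},
)
print(page)
```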

6. Results: translation platform

  • Our human translator estimates a 35 percent reduction in translation time using Weblate compared to manual methods.
  • He found it easy to use. He required no training other than reviewing the instruction paper from the project website.
  • We believe that even greater efficiencies can be achieved as human translators become more familiar with the protocol.
  • The translated files are transmitted automatically from Weblate to GitHub. When we “pull” the update from GitHub, the translated files are automatically published on our project website in machine-readable and human-readable formats.
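
To see what the translator's 35 percent estimate implies, the reduction can be applied to any manual baseline. The baseline of 100 hours below is purely hypothetical; only the 35 percent figure comes from the pilot.

```python
REPORTED_REDUCTION = 0.35  # translator's estimate from the pilot


def assisted_hours(manual_hours: float, reduction: float = REPORTED_REDUCTION) -> float:
    """Estimated hours with computer-assisted translation, given a manual baseline."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return manual_hours * (1.0 - reduction)


baseline = 100.0  # hypothetical manual effort in hours
estimate = assisted_hours(baseline)
print(f"{baseline:.0f} manual hours -> about {estimate:.0f} assisted hours")
```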

6. Results: machine reading tool

  • Our pilot method for preparing machine-readable files was the most developed approach available using the draft SDMX metadata concepts. However, it remained time-intensive and did not reflect the finalized metadata concepts of the SDG SDMX Working Group.
  • In post-production pilot work, and in consultation with the SDG SDMX Working Group, we updated and streamlined the process for preparing machine-readable files. This will:
    • save time and cost for our World Bank project,
    • enable country SDG metadata flows, and
    • assist the IAEG-SDG with indicator version control.
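
One way machine-readable text helps with version control is that two versions of the same field can be compared automatically. The sketch below uses Python's standard `difflib.unified_diff`; the two sample versions and the labels `v1` and `v2` are invented for illustration and are not taken from any real indicator.

```python
import difflib


def describe_changes(old_text: str, new_text: str, old_label: str, new_label: str) -> list:
    """Return unified-diff lines showing what changed between two versions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Invented sample text for two versions of one metadata field.
version_1 = "Data come from household surveys.\nCollected every three years."
version_2 = "Data come from household surveys.\nCollected every two years."

changes = describe_changes(version_1, version_2, "v1", "v2")
print("\n".join(changes))
```

Lines prefixed with `-` were removed and lines prefixed with `+` were added, so a reader can see at a glance what changed between versions.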

7. Next steps

  • Our protocols, machine-readable files in ENG and RUS, and presentations are available on our website at https://worldbank.github.io/sdg-metadata/
  • We have proposed Phases 2 and 3 and are awaiting approval. In Phase 2, the remaining Tier 1 indicator metadata would be translated into RUS by summer; in Phase 3, the Tier 2 indicator metadata would be translated into RUS by fall.
  • We are discussing the translation tool with StatCan and INSEE, with a view to preparing French (FRE) machine-readable metadata files. These would be available to all.
