Multidimensional Quality Metric Quality Issue Types

version 2.5.5 (2013 September 3)

Previous versions:

2.5 (2013 August 28) • Link

2.4 (2013 June 24) • Link

2.3 (2013 June 17) • Link

2.2 (2013 June 2) • Link
2.1 (2013 May 21) • Link
2.0 (2013 January 18) • Link

Note: This version is preliminary, and the content, names, and values of nodes within the hierarchy may change. Citations to this hierarchy must reference the version number and URL to ensure accuracy.

This document describes the issues only. Information on dimensions and scoring methods is maintained separately.

Summary:

This document provides an overview of the structure of the set of issue types defined by the Multidimensional Quality Metrics (MQM). It includes a description and examples of each type, along with graphical representations of the overall structure. This document addresses only product quality issues (i.e., those related to the translation product) and does not address project or production quality.

This version differs significantly from earlier versions in that it makes a distinction between core and extension issue types. Prior to this distinction, MQM had a high degree of complexity that was overkill for most applications. By moving the complexity into modules and maintaining a compact and simple core, MQM implementation will be easier for the majority of cases and it will be clearer when additional complexity should be invoked.

The list of MQM issue types defines a catalog of issue types relevant for assessing the quality of both translated texts and monolingual documents. While some of the issue types will not apply in the case of monolingual documents, the majority do apply and can be used to evaluate source document quality relative to the quality of translated documents.

Contents

Background

Scope

Issue vs. error

According to Specifications…

Scoring

Evolution and Sources

Background

Scope

Multidimensional Quality Metrics (MQM) defines a set of issue types related to translation product quality. It does not address translation project- or production-related issues, even though a full consideration of translation quality would address these issues. Readers interested in these aspects are invited to consult the EN 15038 and ISO 9000-series standards, as well as the relevant sections of ISO/TS 11669.

Issue vs. error

The term issue as used in this document refers to any potential error detected in a text, even if it is determined not to be an error. For example, if an automated process finds that a term in the source does not appear to have been translated properly, it has identified an issue. If human examination finds that the term was translated improperly, it is an error. However, examination might also find that the issue was not an error because the linguistic structure in the translation dictated that the term be replaced by a pronoun, so the translation is correct.

In most cases of translation quality assessment, issues will be errors, but with automated issue detection, some issues will not be errors. Accordingly this document refers to issues in most contexts.

According to Specifications…

Although not covered in this document, the concept of specifications (dimensions) is vital in MQM. More can be learned about specifications/dimensions at the QTLaunchPad website. Specifications help determine what should be counted as an error. For example, if specifications state that a text is being translated for use in a regulated industry, issues related to legal claims will be important in a way that they would not be for a text intended as humorous commentary on contemporary German politics.

As a result, issues/errors should be counted only with respect to the specifications and corresponding metric chosen, including any locale conventions. For example, an informative text about Hungarian culture might mention that Hungarian names use a family name-first ordering convention. If this text were translated into Hungarian, however, this explanation might be omitted since any educated reader would already know about Hungarian naming conventions. In such cases the omission would not be an error and should not be counted against the translator.

In many cases, an issue that looks like an error is not one if the translator made the change intentionally and appropriately. Reviewers need to be aware of and competent in interpreting specifications and metrics to avoid improper penalization of translators.

Scoring

The default MQM scoring method is via error counts. Errors and their severities are counted to assign penalties, which are deducted from a theoretical perfect score of 100% to deliver a percentage quality score. Individual issue types can also be “weighted” to give them more or less importance.

Scoring is described on the QTLaunchPad website.
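As a rough illustration of the error-count approach described above, the following Python sketch deducts weighted severity penalties from a perfect score of 100%. The severity penalty values, the normalization by word count, and all names used here are assumptions made for illustration only; they are not taken from the MQM scoring method itself.

```python
from dataclasses import dataclass

# Hypothetical penalty points per severity level (assumed, not from MQM).
SEVERITY_PENALTY = {"minor": 1.0, "major": 5.0, "critical": 10.0}


@dataclass(frozen=True)
class Error:
    issue_type: str
    severity: str


def quality_score(errors, word_count, weights=None):
    """Return a percentage: 100 minus the weighted penalties per 100 words.

    Normalizing by word count is an assumption made so that the result
    can be read as a percentage.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    weights = weights or {}
    penalty = sum(
        SEVERITY_PENALTY[e.severity] * weights.get(e.issue_type, 1.0)
        for e in errors
    )
    # Clamp at zero so that very poor texts do not yield negative scores.
    return max(100.0 - 100.0 * penalty / word_count, 0.0)


errors = [
    Error("Mistranslation", "major"),
    Error("Spelling", "minor"),
    Error("Spelling", "minor"),
]
print(quality_score(errors, word_count=500))
# Giving Spelling half the default weight reduces its contribution.
print(quality_score(errors, word_count=500, weights={"Spelling": 0.5}))
```

Passing a weight below 1.0 for an issue type makes it count less, and a weight above 1.0 makes it count more, which corresponds to the "weighting" mentioned above.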

Evolution and Sources

The following issue types and structure are based on an analysis of existing human translation-oriented quality metrics and systems. They represent a non-strict superset of the issues found in existing systems. The superset is non-strict because it does not contain the full granularity of some existing systems. For example, the CheckMate quality system (part of the open-source Okapi framework) includes very detailed issue types for dealing with whitespace, which are subsumed into a single category in MQM. With the exception of such issues where some granularity may be lost, the existing quality assessment systems can be mapped to MQM and described in terms of the issue types listed in this document.

Quality assessment metrics/tools consulted in creating MQM include the following:

LISA QA Model
SAE J2450
ISO 14080
SDL TMS Classic
ApSIC XBench
CheckMate
XLIFF:doc
QA Distiller
ATA Certification test

High-Level Structure

MQM consists of a set of “issue types,” potential errors that can be detected in texts. Although MQM is oriented towards assessing the quality of translations, many of the issues can be applied to monolingual texts as well. The issues in MQM are organized in a hierarchy. At the highest level, they are grouped in six categories (plus Other), as shown below:

Figure 1. Top-level structure

Of these, three categories are considered “core”: Accuracy, Fluency, and Verity. Three additional top-level categories are treated as modules that may be used for special purposes: Design, Internationalization, and Compatibility.^[1] In addition, the category Other is reserved for any issues that are not otherwise covered in MQM.

The definitions of these top-level categories are as follows:

Core categories:

Accuracy. Accuracy addresses the extent to which the target text accurately corresponds to the source text. For example, if a translated text tells the user to push a button when the source text tells the user not to push it, there is an accuracy issue.
Fluency. Fluency relates to the monolingual qualities of the source or target text, relative to agreed-upon specifications, but independent of the relationship between source and target. In other words, fluency issues can be assessed without regard to whether the text is a translation or not. For example, a spelling error or a problem with register remains an issue regardless of whether the text is translated or not.
Verity. Verity issues address the relationship between the text and its world, such as whether the text is adequate or sufficient in its representation of the world or suffers from other problems with regard to usability. For example, if a text is translated accurately and fluently but contains material claims that may be acceptable in the source market but which are illegal in the market for which it was translated, these deficiencies are addressed under Verity.

Modules

Design. Design issues relate only to those cases in which formatting or styling applies to text. If, for example, text should be in italic but instead appears as bold, that is a design issue. The Design module will be used in many common situations, but is not considered core because it does not apply to translation in general.
Internationalization. Internationalization issues address problems arising because content was not properly prepared for translation/localization. For example, if a form hard-codes a German date format, it will not be suitable for use in the U.S., where a different date format is expected. (Note: at present, the Internationalization category is a “stub” with no subcategories that is used for all sorts of internationalization errors. This branch may be expanded in the future.)
Compatibility. Compatibility contains a list of issue types that would not normally be addressed in MQM, either because they deal with process or business relationship issues, or because they address functional issues not associated with text per se (e.g., whether a localized application can work properly with other software). They are included to allow compatibility between MQM and legacy metrics (notably the LISA QA Model).

Each of the branches listed above expands into a list of specific issue types, arranged hierarchically (with the exception of Compatibility, where the issues are a flat list, and Internationalization, which is presently unelaborated). The following sections will describe the structure of these branches.

MQM Core

Within each of the core branches, some issue types are considered “core” and others are present in extended modules that can be invoked as needed. MQM has a core of 19 issue types. These are relatively high-level issue types that can account for most issues related to translation itself. The core can be represented as follows:

Figure 2. Core issue types. (An asterisk (*) after an issue name indicates issues that are amenable to automatic detection)

(The labels Content and Mechanical are for convenience in grouping issues.^[2])

The core contains a total of 19 issue types, defined below. It is not anticipated that assessment tasks will use all 19 categories, but rather will use a relevant selection.

NOTE: within any branch of MQM, ordering is significant: If multiple issue types could apply to an issue, the first relevant one should be selected. See the section Guidelines for selecting issue types (below) for more details on selecting issue types.
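The ordering rule in the note above can be sketched in Python as follows. The function name and the idea of passing the applicable types as a set are assumptions made for illustration; the list simply follows the order in which the Accuracy subtypes appear in this document.

```python
# Accuracy subtypes in the order in which this document lists them.
ACCURACY_ORDER = ["Terminology", "Mistranslation", "Omission", "Untranslated", "Addition"]


def select_issue_type(ordered_types, candidates):
    """Return the first type, in branch order, that is among the candidates."""
    for issue_type in ordered_types:
        if issue_type in candidates:
            return issue_type
    # No listed subtype applies; the caller may fall back to the branch itself.
    return None


# A reviewer judges that both Mistranslation and Terminology could apply.
chosen = select_issue_type(ACCURACY_ORDER, {"Mistranslation", "Terminology"})
print(chosen)
```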

MQM Core Issue Types

Note that the three high-level branches serve as issue types in the core in their own right. They are used for any cases of issues that fall under their scope but which are not defined by a subtype.

Accuracy (apply to target text only)

The target text does not accurately reflect the source text, allowing for any differences authorized by specifications.

Note: Most cases of Accuracy are addressed by one of the more specific subtypes listed below.

Terminology
A term is translated with a term other than the one expected for the domain or otherwise specified.

Example: A French text translates English “e-mail” as “e-mail” but terminology guidelines mandate that courriel be used.

Example: The English musicological term dog is translated (literally) into German as Hund instead of as Schnarre, as specified in a terminology database.

Mistranslation
The target content does not accurately represent the source content.

Example: A source text states that a medicine should not be administered in doses greater than 200 mg, but the translation states that it should not be administered in doses less than 200 mg.

Omission
Content is missing from the translation that is present in the source.

Example: A paragraph present in the source is missing in the translation

Untranslated
Content that should have been translated has been left untranslated.

Example: A sentence in a Japanese document translated into English is left in Japanese.

Addition
The target text includes text not present in the source.

Example: A translation includes portions of another translation that were inadvertently pasted into the document.

Fluency (apply to source or target text)

Issues related to the form or content of a text, irrespective of whether it is a translation or not.

Note: If an issue can be detected only by comparing the source and target, it MUST NOT be categorized as a Fluency issue.

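Content: Issues related to content, excluding presentational and/or mechanical issues

Register
The text uses a linguistic register inconsistent with the specifications or with general language conventions.

Example: A legal notice in German uses the informal du instead of the formal Sie.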

Style
The text has stylistic problems, other than those related to language register.

Example: A text uses a confusing style with long sentences that are difficult to understand.

Inconsistency
The text shows internal inconsistency.

Example: The text states that bug reports should be submitted to a mailing list in one place and via an online bug tracker tool in another.

Mechanical: Issues related to the presentation and/or mechanics of the text

Spelling
Issues related to spelling of words

Example: The German word Zustellung is spelled Zustetlugn.

Typography
Issues related to the mechanical presentation of text. This category should be used for any typographical errors other than spelling.

Example: A text uses punctuation incorrectly.

Example: A text has an extraneous hard return in the middle of a paragraph.

Grammar
Issues related to the grammar or syntax of the text, other than spelling and orthography.

Example: An English text reads “The man was in seeing the his wife.”

Locale violation
The text does not adhere to locale-specific conventions or has been written for the wrong locale.

Example: An incorrect format for currency is used for a German text, with a comma (,) instead of a period (.) as a thousands separator.

Unintelligible
The exact nature of the error cannot be determined. Indicates a major breakdown in fluency.

Example: The following text appears in an English translation of a German automotive manual: “The brake from whe this કુતારો િસ S149235 part numbr,,."

Verity (apply to source or target text)
The text makes statements that contradict the world of the text^[3]

Example: The text states that a feature is present on a certain model of automobile when in fact it is not available.

Completeness
The text is incomplete. (NB: For cases where material from the source language is not present in a translation, Omission should be used instead.)

Example: A process description leaves out key steps needed to complete the process, resulting in an incomplete description of the process.

Legal requirements
A text does not meet legal requirements as set forth in the specifications.

Example: Specifications stated that FCC regulatory notices be replaced by CE notices rather than translated, but they were translated instead, rendering the text legally problematic for use in Europe.

Locale applicability
A text does not apply to the intended locale.

Example: An advertising text translated for Sweden refers to special offers available only in Germany.

Building a Metric from the MQM Core

As mentioned above, it is not expected that most assessment tasks will use all of the core categories. Instead, they represent the most common translation quality assessment issue types and can serve as a common set from which to build metrics. For example, consider a task in which machine translation used for on-demand support purposes for a software package is assessed. In this case it is:

Unlikely that Typography, Inconsistency, Style, Register, Grammar, or Locale Violation will be assessed (since the requirement is to assist users in operating software and the content is for information only).
Omission and Addition will generally not be assessed either since machine translation seldom adds or omits content.
The Verity branch would probably be omitted entirely since it is assumed that the support content is accurate to start with and it is impractical to assess this aspect for an on-demand environment.

The resulting metric, which complies with the MQM Core, might appear as follows, with five issues:

Figure 3. Sample metric for assessing on-demand translation of support materials.

This metric is used to assess the suitability of translations and is interested only in the extent to which the system produces unreadable content (Unintelligible), incorrectly translates content (Mistranslation), leaves content untranslated (Untranslated), violates terminology requirements as defined in a bilingual glossary (Terminology), or misspells words (Spelling). This simple metric might be adequate for determining if the MT system is producing acceptable results. (Note that Accuracy and Fluency are grayed out, indicating that they are not counted separately and that only the five issue types shown are counted.)
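The selection process described in this section can be sketched in Python as follows. The dictionary layout and the function name are assumptions made for illustration; the issue type names and the exclusions are those given in the text above.

```python
# Core subtypes by branch, as named in this document. The three branches
# themselves (Accuracy, Fluency, Verity) make up the remaining core types.
CORE = {
    "Accuracy": ["Terminology", "Mistranslation", "Omission", "Untranslated", "Addition"],
    "Fluency": ["Register", "Style", "Inconsistency", "Spelling", "Typography",
                "Grammar", "Locale violation", "Unintelligible"],
    "Verity": ["Completeness", "Legal requirements", "Locale applicability"],
}


def build_metric(core, drop_branches=(), drop_issues=()):
    """Return the subtypes left after removing whole branches and single issues."""
    selected = []
    for branch, subtypes in core.items():
        if branch in drop_branches:
            continue
        selected.extend(s for s in subtypes if s not in drop_issues)
    return selected


support_metric = build_metric(
    CORE,
    drop_branches={"Verity"},
    drop_issues={"Typography", "Inconsistency", "Style", "Register", "Grammar",
                 "Locale violation", "Omission", "Addition"},
)
print(support_metric)
```

The result contains only subtypes, which matches the sample metric in that the Accuracy and Fluency branch nodes are not counted separately.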

It is anticipated that most MQM-compliant metrics would use a small number of issue types. However, because some requirements may dictate additional detail/granularity, MQM contains extensions, as discussed in the next section. Users are encouraged, where possible, to limit issue type selection to the core in order to foster greater interoperability. Where extensions are required, their use should be limited as much as possible and the most abstract level of granularity that meets requirements should be used.

Extensions

As previously noted, extensions provide a way to add capabilities or granularity to MQM. This section describes the extensions to each branch of the MQM issues, including definitions and examples. As some of the content in each extension consists of categories intended to give deeper levels of granularity to existing categories, categories from the core may be repeated in the extensions, but will be rendered in gray.

As with the core structure, extensions will generally not be used in their entirety; rather, a selection may be used. For example, if an assessment task is being undertaken to understand the status of a particular translation with respect to grammatical issues, the Fluency extension may be used together with its more detailed subcategories under Grammar.

Accuracy

The Accuracy extension consists entirely of nine categories that provide additional granularity beyond the core, as shown in the following diagram:

Figure 4. Structure of the Accuracy extension.

Definitions for issues in the Accuracy extension

The additional issues in this extension are defined as follows.

Terminology^[4]

Terminology, normative
A term is translated in a way that does not accord with its normative translation (i.e., a translation mandated in a termbase or other authoritative listing of terms and their translations that was specified for use in the translation) versus general domain usage.

Example: A database of legal terms mandates that the English term contract be translated as Auftrag in German, but the more common Vertrag was used.

Mistranslation

Overly literal
The translation is overly literal.

Example: A Hungarian text contains the phrase Tele van a hocipőd?, which has been translated as “Are your snow boots full?” rather than with the idiomatic meaning of “Feeling overwhelmed?”.

False friend
The translation has incorrectly used a word that is superficially similar to the source word.

Example: The Italian word simpatico has been translated as sympathetic in English.

Should not have been translated
Text was translated that should have been left untranslated

Example: A Japanese translation refers to “Apple Computers” as アップルコンピュータ when the English expression should have been left untranslated.

Date/time
Dates or times do not match between source and target.

Example: A German source text provides the date 09.02.09 (=February 9, 2009) but the English target renders it as September 2, 2009.

Example: An English source text specifies a time of "4:40 PM" but this is rendered as 04:40 (=4:40 AM) in a German translation.
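The date example above can be reproduced with a short Python sketch, which parses the same string under a day-first and a month-first reading. The variable names are illustrative, and the month-first format string is an assumption about how the misreading arose.

```python
from datetime import datetime

raw = "09.02.09"

# German convention: day.month.year
german = datetime.strptime(raw, "%d.%m.%y").date()
# Misreading the same digits as month.day.year
misread = datetime.strptime(raw, "%m.%d.%y").date()

print(german.isoformat(), misread.isoformat())
print("Date/time issue" if german != misread else "dates match")
```

A check of this kind only flags a candidate issue; whether it is an error still depends on the specifications.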

Unit conversion
The target text has not converted numeric values as needed to adjust for different units (e.g., currencies, metric vs. U.S. measurement systems).^[5]

Example: A source text specifies that an item is 25 centimeters (~10 inches) long, but the target states that it is 25 inches (63.5 cm) long.

Number
Numbers are inconsistent between source and target.

Example: The source text specifies that a part is 124 mm long but the target text specifies that it is 147 mm long.

Entity (such as name or place)
Names, places, or other “named entities” do not match

Example: The source text refers to Dublin, Ohio, but the target incorrectly refers to Dublin, Ireland.

Untranslated

Untranslated graphic
Text in a graphic was left untranslated.

Example: Part labels in a graphic were left untranslated even though running text was translated

Fluency Extension

The Fluency extension consists of 39 additional issue types (12 in the Content branch and 27 in the Mechanical branch), including both new high-level categories and additional granularity.

Definitions for issues in the Fluency extension

The definitions for the extension issues are as follows:

Content issues

Twelve (12) issues are added in the content branch, as shown below:

Figure 5. Structure of the Content branch of the Fluency extension.

These issues are defined as follows:

Variants/slang
The text uses words such as slang that are inappropriate for the intended register.

Example: A text refers to dollars as “clams,” when this slang term would be inappropriate.

Style

Company style
The text violates company- or organization-specific style guidelines.

Example: Company style states that passive sentences may not be used but the text uses passive sentences.

Style guide
The text violates style defined in a normative specification

Example: Specifications stated that English text was to be formatted according to the Chicago Manual of Style, but the text delivered followed the American Psychological Association style guide.

Inconsistency

Abbreviations
The form of abbreviations is inconsistent in the text.

Example: A text uses both “app.” and “approx.” for approximately.

Images vs. text
Images are inconsistent with the text that refers to them.

Example: A screen shot shows a button with the text “Open other…” but the text referring to the screen shot tells the user to click on the “Open alternative…” button.

Discourse
The discourse structure of the text is inconsistent in a confusing or unclear manner.

Example: The text has a mixture of imperatives, descriptions of actions, and lists within a single process, making it difficult to follow the intended course of action.

Terminological inconsistency
Terminology is used in an inconsistent manner within the text.

Example: The text refers to a component as the brake release lever, brake disengagement lever, manual brake release, and manual disengagement.

NB: This issue should not be used for cases where terminology has been translated incorrectly (Accuracy: Terminology) or cases where the wrong term is used in a source document (Fluency: Content: Monolingual Terminology).

Duplication
Content has been duplicated (e.g., a word or longer portion of text is repeated unintentionally).

Example: A text reads “The man the man whom she saw…”
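Duplication of this kind can often be flagged automatically. The following Python sketch uses a regular expression with a backreference to find an immediately repeated word or short phrase. The pattern, the three-word limit, and the function name are assumptions made for illustration, and intentional repetitions would still need human review.

```python
import re

# One to three words followed by whitespace and the same words again.
# re.IGNORECASE lets "The man the man" match despite the capital letter.
DUPLICATE = re.compile(r"\b(\w+(?:\s+\w+){0,2})\s+\1\b", re.IGNORECASE)


def find_duplications(text):
    """Return each word or phrase that is immediately repeated in the text."""
    return [m.group(1) for m in DUPLICATE.finditer(text)]


print(find_duplications("The man the man whom she saw"))
```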

Monolingual terminology
Terms (as opposed to general-language words) are used incorrectly.^[6]

Example: The term piano action should be used but piano mechanism is used instead.

Normative monolingual terminology
Terms are used in violation of formal guidelines in a terminology database or other terminology resource.^[7]

Example: A text uses the term “Acme TM200" instead of the mandated “Acme TM2000®”.

Ambiguity
The text is ambiguous in its meaning.

Example: A text reads “I cannot recommend this too highly.” (The meaning can be that the speaker cannot make a good recommendation or that it is highly recommended.)

Unclear reference
The text uses relative pronouns or other referential mechanisms that are unclear as to their reference.

Example: A text reads “After completing this, move to the next step,” but there are a number of possible referents for this in the text.

Mechanical issues

Twenty-seven (27) issues are added to the Mechanical branch, as shown below:

Figure 6. Structure of the Mechanical branch of the Fluency extension.

Spelling

Capitalization
Issues related to capitalization

Example: The name John Smith is written as “john smith”

Diacritics
Issues related to the use of diacritics

Example: The Hungarian word bőven (using o with a double acute) is spelled as bõven, using a tilde (˜), which is not found in Hungarian.

Typography

Punctuation
Punctuation is used incorrectly for the locale or style^[8]

Example: An English text uses a semicolon where a comma should be used.

Unpaired quote marks or brackets
One of a pair of quotes or brackets (e.g., a (, [, or { character ) is missing from text.

Example: A text reads “King Ludwig of Bavaria (1845–1886 was deposed on account of his supposed madness.”
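Unpaired brackets lend themselves to automatic detection. The following Python sketch uses a stack to report brackets that have no partner. It covers only (), [], and {}; quote marks are left out because straight quotes use the same character for opening and closing. The function name and sample sentence are illustrative.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = {close: open_ for open_, close in PAIRS.items()}


def unpaired_brackets(text):
    """Return (index, character) for every bracket without a matching partner."""
    stack, problems = [], []
    for i, ch in enumerate(text):
        if ch in PAIRS:
            stack.append((i, ch))
        elif ch in CLOSERS:
            if stack and stack[-1][1] == CLOSERS[ch]:
                stack.pop()  # matched the most recent opening bracket
            else:
                problems.append((i, ch))  # closing bracket with no opener
    problems.extend(stack)  # opening brackets that were never closed
    return sorted(problems)


sample = "See the appendix (page 12 for details."
print(unpaired_brackets(sample))
```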

Grammar

Morphology (word form)
There is a problem in the internal construction of a word

Example: An English text has comed instead of came.

Part of speech
A word is the wrong part of speech

Example: A text reads “Read these instructions careful” instead of “Read these instructions carefully.”

Agreement
Two or more words do not agree with respect to case, number, person, or other grammatical features

Example: A text reads “They was expecting a report.”

Word order
The word order is incorrect

Example: A German text reads “Er hat gesehen den Mann” instead of “Er hat den Mann gesehen.”

Function words
A function word (e.g., a preposition, “helping verb”, article, determiner) is used incorrectly.

Example: A text reads “Check the part number as given in the screen” instead of “…on the screen”.

Example: A text reads “The graphic is then copied into an internal memory” instead of “The graphic is copied to internal memory.”

Locale violation

Date format
A text uses a date format inappropriate for its locale.

Example: An English text has “2012-06-07” instead of the expected “06/07/2012.”

Time format
A text uses a time format inappropriate for its locale.

Example: A text written for the U.S. uses a 24-hour time notation rather than AM/PM time.

Measurement format
A text uses a measurement format inappropriate for its locale.

Example: A text in France uses feet and inches and Fahrenheit temperatures.

Number format
A text uses a number format inappropriate for its locale.

Example: A German text has 123,456 instead of the locale-appropriate 123.456.

Quote marks type
A text uses quote marks inappropriate for its locale.

Example: A French text should use guillemets («») but instead uses German-style quotes („“).

National language standard
A text violates national language standards.

Example: A French advertising text uses anglicisms that are forbidden for print texts by the Académie française specifications.

Character encoding
Characters are garbled due to incorrect application of an encoding.

Example: A text document in UTF-8 encoding is opened as ISO Latin-1, resulting in all “upper ASCII” characters being garbled.
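The garbling described in this example can be reproduced in a few lines of Python. The sample sentence is an illustrative assumption; the sketch encodes it as UTF-8, decodes the bytes as ISO Latin-1, and then reverses the mistake.

```python
original = "Zustellung für München"

# Bytes written as UTF-8 but read back as ISO Latin-1.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)

# Because Latin-1 maps every byte to a character, the damage is reversible
# as long as the garbled text has not been altered further.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)
```

Each character outside the ASCII range shows up as two characters after the wrong decoding, which is the typical signature of this issue.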

Nonallowed characters
The text includes characters that are not allowed.^[9]

Example: A text may not include colons or forward- or back-slashes, which might cause confusion with path names on some computer systems, but it contains these characters.

Pattern problem
The text contains a pattern (e.g., text that matches a regular expression) that is not allowed.

Example: The regular expression ["'”’][,\.;] (i.e., a quote mark followed by a comma, full stop, or semicolon) is defined as not allowed for a project but a text contains the string ”, (closing quote followed by a comma).
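A check for the pattern in this example might look like the following Python sketch. The regular expression is the one given above; the function name and the sample sentences are illustrative assumptions.

```python
import re

# A quote mark followed by a comma, full stop, or semicolon.
FORBIDDEN = re.compile(r"""["'”’][,\.;]""")


def find_forbidden(text):
    """Return every occurrence of the disallowed pattern in the text."""
    return FORBIDDEN.findall(text)


print(find_forbidden("He said “stop”, then left."))
print(find_forbidden("He said “stop,” then left."))
```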

Sorting
A list is not in the appropriately collated sequence.^[10]

Example: A listing of items should be in alphabetical order but appears in a random order instead.

Corpus conformance^[11]
The content is deemed to have a level of conformance to a reference corpus. The non-conformance type reflects the degree to which the text conforms to a reference corpus given an algorithm that combines several classes of error type to produce an aggregate rating.

Example: A text reading “The harbour connected which to printer is busy or configared not properly” is flagged by a language analysis tool as suspect based on its lack of conformance to an existing corpus.

Broken link/cross-reference
A link or cross reference points to an incorrect or nonexistent location

Example: An HTML document has an href that points to a file that does not exist.

document-internal^[12]
A link or cross reference points to an incorrect or nonexistent location within the same document within which it occurs.

Example: An internal link refers to the location “#section5” but there is no anchor “section5” in the document.
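Document-internal links in HTML can be checked mechanically. The following Python sketch uses the standard library's html.parser to collect anchor targets (id attributes and named anchors) and to report fragment links with no matching target. The class and function names are assumptions made for illustration, and other document formats would need a different parser.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect link targets and the fragment identifiers that refer to them."""

    def __init__(self):
        super().__init__()
        self.targets = set()      # values of id="..." and <a name="...">
        self.internal_refs = []   # fragments used in href="#..."

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.targets.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.targets.add(attrs["name"])
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.internal_refs.append(href[1:])


def broken_internal_links(html):
    """Return the fragment identifiers that have no target in the document."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return [ref for ref in parser.internal_refs if ref not in parser.targets]


page = (
    '<h2 id="section4">Four</h2>'
    '<a href="#section4">ok</a> <a href="#section5">broken</a>'
)
print(broken_internal_links(page))
```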

document-external
A link or cross reference points to an incorrect or nonexistent location outside of the same document within which it occurs

Example: A link in an HTML document points to a U.S. government URL that has moved and no longer exists.

Index/TOC
Issues related to an index or Table of Contents (TOC).

Example: A Table of Contents is missing items that should be included.

Page references
An index/TOC refers to incorrect page numbers

Example: A table of contents refers to page numbers from the source document that do not apply to the translated text.

Index/TOC format
An index/TOC is formatted incorrectly

Example: A Table of Contents should be formatted with variable (hierarchical) indenting and tab leader characters, but is instead displayed as a “run-in” list.

Missing/incorrect item
Items in an index/TOC are incorrect or missing

Example: A chapter heading is not listed in a Table of Contents.

Verity extension

The Verity extension consists of two categories that extend the granularity of one category, as shown below:

Figure 7. Structure of the Verity extension

Definitions for issues in the Verity extension

Completeness

Lists. A list is missing necessary items.

Example: A list of items included in a retail package omits a crucial component.

Procedures. A procedure is missing necessary steps.

Example: A document describing a procedure to restart a diesel generator omits a crucial step that must be completed prior to performing additional steps.

Design extension

The Design extension comprises the entire Design branch of MQM. It applies only in cases where formatting is significant. It consists of 36 issue types, in a hierarchy, as shown below:

Figure 8. Structure of the Design extension.

Note that for computational purposes in generating MQM scores, Design is generally counted with Fluency, although individual issues may align more closely with Accuracy in concept.

Definitions for issues in the Design extension

Overall design (layout)
Issues related to overall layout and design (versus local formatting)

Color
Colors are used incorrectly

Example: Headings should be blue but are green instead.

Global font choice
The overall font chosen is incorrect or inappropriate.

Example: An English source text uses a normal-weight serif font for body text but the Japanese translation uses a heavy-weight “gothic” (roughly, sans-serif) font appropriate for headlines only.

Footnote/endnote format
Footnotes or endnotes are placed inappropriately or use incorrect in-text symbols

Example: Specifications state that endnotes should be used with roman numerals but footnotes were used with in-text symbols (*, †, ‡, etc.).

Headers and footers
Headers or footers are formatted incorrectly

Example: Headers should appear on every page but have been omitted on odd-numbered pages.

Margins
Text margins are incorrect.

Example: Specifications called for 4 cm inside margins, but 2.5 cm margins were used instead.

Widows/orphans
The text has widows or orphans (single or short lines of text that appear on a separate page from the rest of a paragraph).

Example: Specifications state that at least two lines of a paragraph must appear on a page (if the paragraph is more than one line), but a single line starts a page while two appear on the previous page.

Page breaks, inappropriate
Page breaks appear in inappropriate locations.

Example: There is a page break between a figure and its caption.

Local formatting
Issues related to local formatting (rather than to overall layout concerns)

Text alignment
A portion of a text is aligned inappropriately.

Example: A heading should be left-aligned but was centered instead.

Paragraph indentation
A paragraph is indented improperly.

Example: The first line of body paragraphs should be indented 4 mm, but some paragraphs were indented 25 mm instead.

Font
Issues related to local font usage (i.e., font choices that impact a span of content rather than the global choice of the document).

Example: Warning texts are set in sans-serif, but one of them appears in a serif font.

Example: A portion of Japanese text is set with an obliqued face (corresponding to italics in the source text) when dot accents should have been used with a non-oblique face.

Bold/italic
Bold or italics are used incorrectly.

Example: A book title should have been italicized, but the italics were omitted.

Wrong size
The font size is incorrect.

Example: A legal notice should be set in a 9 pt size, but was instead set in 7 pt.

Font, single/double-width (CJK only)
Single-width characters are used when double-width are intended, or vice versa.

Example: A Japanese text includes カタカナ (full-width kana) when specifications required ｶﾀｶﾅ (half-width kana) instead, due to a limited display size.
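
A check for this issue type can be partly automated. The following Python sketch is illustrative only and is not part of MQM: it assumes that the Unicode East Asian Width property is an acceptable proxy for single versus double width, and the function name find_wide_chars is hypothetical.

```python
import unicodedata


def find_wide_chars(text):
    """Return the characters that Unicode classifies as Wide or Fullwidth."""
    # "W" covers ordinary kana and kanji; "F" covers fullwidth compatibility forms.
    # Half-width kana are classified as "H" and are therefore not returned.
    return [ch for ch in text if unicodedata.east_asian_width(ch) in ("W", "F")]
```

Where specifications require half-width kana, any character returned for a target string is a candidate for this issue type. The reverse check (looking for class "H") would apply where double-width characters are required.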

Kerning
Kerning (inter-character spacing) is wrong.

Example: The letters T and A in the word TAMPA are spaced too close together and collide.

Leading
Leading (spacing between lines of text) is incorrect.

Example: A translated Japanese text has lines set too close together, making the text difficult to read.

Markup
Issues related to “markup” (codes used to represent structure or formatting of text, also known as “tags”).^[13]

Inconsistent markup
Markup elements are inconsistent between the source and target.

Example: A target text has a set of tags for bold face in the same location where the source has tags for italics.

Misplaced
Markup is present but misplaced.

Example: A segment has three sets of paired formatting tags at the end, after the final full stop (.).

Added
The target text has markup added with no corresponding markup in the source.

Example: A source segment has no formatting tags, but the target has a set of italic tags.

Missing
Markup in the source is missing in the target.

Example: A source segment has a set of italic tags, but the target text does not have any tags.

Questionable
Markup is present that appears malformed or inappropriate for its context.

Example: A text has opening tags but no closing tags for formatting.
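
The Inconsistent, Added, Missing, and unclosed-tag cases above lend themselves to a simple automated comparison. The following Python sketch is an illustration under stated assumptions, not an MQM-defined procedure: it assumes XML-like inline tags, ignores attributes and tag position (so it cannot detect Misplaced markup), and treats self-closing tags as opening tags. The names TAG_RE, tag_names, compare_markup, and unclosed are hypothetical.

```python
import re
from collections import Counter

# Captures an optional leading slash and the element name of an XML-like tag.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def tag_names(segment):
    """Count opening tags by element name; closing tags are ignored."""
    return Counter(name for slash, name in TAG_RE.findall(segment) if not slash)


def compare_markup(source, target):
    """Report element names present on one side but not the other."""
    src, tgt = tag_names(source), tag_names(target)
    return {
        "missing": sorted((src - tgt).elements()),  # in source, not in target
        "added": sorted((tgt - src).elements()),    # in target, not in source
    }


def unclosed(segment):
    """Return element names that have more opening than closing tags."""
    opens, closes = Counter(), Counter()
    for slash, name in TAG_RE.findall(segment):
        (closes if slash else opens)[name] += 1
    return sorted((opens - closes).elements())
```

A segment pair that yields both a missing and an added element at once (for example, italics in the source and bold in the target) corresponds to the Inconsistent markup case. Findings of this kind still need manual verification, since changes to markup may be deliberate.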

Whitespace
Issues related to whitespace.

Example: A document uses a string of space characters instead of tabs.

Example: Extra spaces are added at the start of a string.
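
Both whitespace examples can be approximated with simple string checks. The Python sketch below is a rough heuristic offered for illustration: the function name whitespace_findings and the rule that two or more consecutive spaces stand in for a tab are assumptions, not MQM requirements.

```python
import re


def whitespace_findings(source, target):
    """Return a list of suspected whitespace issues in the target string."""
    findings = []
    # Compare the number of leading space characters on each side.
    src_lead = len(source) - len(source.lstrip(" "))
    tgt_lead = len(target) - len(target.lstrip(" "))
    if tgt_lead > src_lead:
        findings.append("extra leading spaces")
    # Assumed heuristic: the source uses a tab, the target has none,
    # and the target contains a run of two or more spaces.
    if "\t" in source and "\t" not in target and re.search(r" {2,}", target):
        findings.append("spaces used instead of tabs")
    return findings
```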

Graphics and tables
Issues related to the formatting of graphics and tables.^[14]

Example: A graphic is garbled and the wrong version is shown.

Position
A graphic or table is positioned incorrectly.

Example: A text refers to Figure 1, but Figure 1 appears six pages after the point where it was referred to.

Missing
A graphic is missing.

Example: An HTML file has an <img> tag that refers to the wrong location, so no graphic is shown.

Call-outs and caption
There are issues with call-outs (text within a graphic that identifies parts) or captions.

Example: During localization the location of numbers used for call-outs has been shifted and the call-outs are no longer usable.

Truncation/text expansion
The target text has insufficient room to display the translated text according to specifications.

Example: The German translation of an English string in a user interface runs off the edge of a dialogue box and cannot be read.

Length
There is a significant discrepancy between the source and the target text lengths.^[15]

Example: An English sentence is 253 characters long but its German translation is 51 characters long.
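
A length check of this kind reduces to a ratio comparison. The Python sketch below is illustrative only: the thresholds low and high are placeholder values chosen for the example, since what counts as significant depends on the languages involved and cannot be stated generally, and the function names are hypothetical.

```python
def length_ratio(source, target):
    """Return target length divided by source length, in characters."""
    if not source:
        raise ValueError("source text must not be empty")
    return len(target) / len(source)


def flag_length(source, target, low=0.5, high=2.0):
    """Flag a segment pair whose length ratio falls outside [low, high].

    The default thresholds are arbitrary placeholders and should be
    set per language pair and content type.
    """
    ratio = length_ratio(source, target)
    return ratio < low or ratio > high
```

For the example above, the ratio is 51/253, or roughly 0.2, which falls below the placeholder lower bound and would be flagged for review.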

Internationalization extension

The Internationalization extension presently consists of a single issue type, Internationalization, which is used for any internationalization errors. This branch may be expanded in the future.

Internationalization
There is a problem related to the internationalization of content.

Example: A document assumes that all addresses use postal codes conforming to the U.S. “zip+four” convention and includes a verification step for postal codes that does not allow for non-U.S. codes.
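
The postal code example can be made concrete with a short Python sketch. It contrasts a verification step that accepts only U.S.-style codes with one that selects a pattern by country. The patterns are deliberately simplified, the country list is an arbitrary sample, and all names (US_ONLY, PATTERNS, is_valid_us_only, is_valid) are hypothetical.

```python
import re

# Verification that assumes every address uses a U.S. ZIP or ZIP+4 code.
US_ONLY = re.compile(r"\d{5}(-\d{4})?")

# Simplified, illustrative patterns keyed by country code.
PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),
    "DE": re.compile(r"\d{5}"),
    "CA": re.compile(r"[A-Z]\d[A-Z] ?\d[A-Z]\d"),
}


def is_valid_us_only(code):
    """Internationalization problem: rejects all non-U.S. formats."""
    return bool(US_ONLY.fullmatch(code))


def is_valid(code, country):
    """Validate against the country's pattern; accept unknown countries."""
    pattern = PATTERNS.get(country)
    if pattern is None:
        return True  # no rule known: do not block the user
    return bool(pattern.fullmatch(code.strip().upper()))
```

Accepting codes for countries without a known pattern is one possible design choice; it avoids rejecting valid addresses at the cost of weaker validation.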

Compatibility extension

The Compatibility extension contains items which may be used for compatibility with legacy metrics even though they would otherwise not be included in MQM. Most of these issue types are taken from the LISA QA Model documentation.

Definitions are not included for these issues.

These categories should not be included in MQM-compatible metrics unless used to represent legacy metrics in MQM-compatible form. They relate either to process or project requirements (which are not covered by MQM and would otherwise be out of scope), address functional issues (such as application compatibility) that are unrelated to linguistic quality, or address

Application compatibility
Bill of materials/runlist
Book-building sequence
Covers
Deadline
Delivery
Does not adhere to specifications
Embedded text
File format
Functional
Output device
Printing
Release guide
Spines
Style, publishing standards
Terminology, contextually inappropriate

These categories should be used for compatibility with legacy metrics (particularly the LISA QA Model) only and their general use is discouraged. These categories are not shown in the graphical overviews in this document.

Other

The Other category is used as a catch-all for any issues not adequately covered by the MQM core or extensions. This category should be used only if it is impossible to assign an issue to an existing category with sufficient granularity.

If such issues are encountered systematically, please report them to info@qt21.eu so that they can be considered for inclusion in future updates of the MQM issue types.

Additional extensions

Additional extensions can be defined by users and may be added as official extensions in time. Additional extensions should not conflict with the core or existing extensions or replace any existing categories. They may add granularity to MQM or add new issue types not anticipated in MQM.

Guidelines for selecting issue types

In many cases multiple issue types may describe a single actual issue/error in the text. For example, if a term is translated incorrectly, this is an example of both the Terminology and Mistranslation issue types; if a date is incorrect in the target text, it is simultaneously a Mistranslation, a Date/Time, and an Entity (such as a name or place) issue. In such cases issues should not be counted multiple times; instead, one issue type should be selected based on the following principles:

The priority for assigning issues to particular branches from the current main and extension branches is as follows: (1) Accuracy, (2) Fluency, (3) Verity, (4) Design. In other words, if an issue can be assigned to more than one main branch, it should be assigned to the lowest-numbered one possible in the list. For example, if a metric checks both Terminology and Inconsistency, and an issue could apply to both, then Terminology would be used because it is in the Accuracy branch.
Internationalization and Compatibility should be used only if the assessor knows how to apply them. In general, a translation error should be assigned to one of the four branches listed above and only engineering errors should be assigned to Internationalization.
If multiple issue types apply at the same level in the hierarchy, the one that appears first in the list should be used. For example, if both Terminology and Mistranslation apply, Terminology will be used because it appears first in the hierarchical listing.
If an issue cannot be readily assigned to any specific issue at one level, it should be assigned to the next higher one. For example, if a problem cannot readily be assigned to any of the subtypes of Mistranslation, then Mistranslation itself should be used.
If multiple issue types apply from within a single branch (i.e., one of the issue types is a specific case of the other one, such as in the case of Overly literal and Mistranslation) then the more specific type will apply (i.e., Overly literal will be used instead of Mistranslation).
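
The branch-priority, listing-order, and specificity principles above can be sketched in Python. This is a minimal illustration, not a normative algorithm: LISTING is a small, assumed excerpt of the hierarchy built only from the examples in this section (the exact depth and position of each node may differ in the full hierarchy), and select_issue_type is a hypothetical name. The principles concerning assessor knowledge and falling back to a parent type require human judgment and are not modeled.

```python
# Branch priority as stated in the guidelines (lowest index wins).
BRANCH_PRIORITY = ["Accuracy", "Fluency", "Verity", "Design"]

# Assumed excerpt of the hierarchy, in listing order. Each entry is the
# path from the branch down to the issue type.
LISTING = [
    ("Accuracy", "Terminology"),
    ("Accuracy", "Mistranslation"),
    ("Accuracy", "Mistranslation", "Overly literal"),
    ("Fluency", "Inconsistency"),
]


def select_issue_type(candidates):
    """Pick one issue type name from the set of applicable candidates."""
    wanted = set(candidates)
    paths = [p for p in LISTING if p[-1] in wanted]
    unknown = wanted - {p[-1] for p in paths}
    if unknown or not paths:
        raise ValueError(f"unknown or empty candidates: {sorted(unknown)}")
    # Specificity: drop any type that is an ancestor of another candidate.
    specific = [
        p for p in paths
        if not any(q != p and q[:len(p)] == p for q in paths)
    ]
    # Branch priority first, then position in the hierarchical listing.
    best = min(
        specific,
        key=lambda p: (BRANCH_PRIORITY.index(p[0]), LISTING.index(p)),
    )
    return best[-1]
```

With this excerpt, the three worked examples in the guidelines resolve to Terminology (over Inconsistency), Terminology (over Mistranslation), and Overly literal (over Mistranslation).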

Log of changes

From version 2.5

Added preliminary “Scope” section
Added the “Related documents” section
Numerous minor textual corrections

From version 2.4.5

Added the “Function words” issue type
Moved a few pieces around
Small textual corrections

[1] Compatibility is not shown in any of the diagrams in this document. It is used for a number of legacy issues described in other systems (notably the LISA QA Model) that address issues not related to product quality; the issue types it contains are deprecated for general use.

[2] These labels can be used as issue types in their own right, but are not counted here.

[3] Note that the “world of the text” may or may not be the real world. In the case of fiction, propaganda, or marketing, claims may arise that are not, in fact, true to the real world, but which are true to the world assumed by the text.

[4] The daughter categories under Terminology should be used only when it is necessary to identify the precise nature of the terminology error, e.g., if it is important to know that a process did not follow a termbase versus knowing that the process did not know common terminology for the domain.

[5] If only the measurement system, rather than the actual value, is wrong, an issue should be treated under locale convention.

[6] In most cases Monolingual terminology will apply to source texts. If a formal bilingual glossary is specified for use, terms in the target text that do not match the translations specified in the glossary MUST be classified as Terminology since a bilingual resource was specified in the project requirements. If in doubt, use Terminology.

[7] This category is to be used only to indicate terminology problems related to a formalized, normative term list when they must be distinguished from errors that do not relate to normative lists.

[8] For cases of systematic uses of quote mark formats from the wrong locale, use Quote marks format under Locale convention instead.

[9] The determination of what characters are or are not allowed is workflow specific and cannot be stated as a general rule. This category should be used only when specific characters are forbidden.

[10] For issues related to the collation of an index or table of contents, use Index/TOC collation instead.

[11] This category is included for compatibility with the ITS 2.0 specification and readers should consult the ITS 2.0 specification (http://www.w3.org/TR/its20/#lqissue-typevalues) for more information.

[12] This category and document-external broken link/cross-reference should be used only if it is important to identify the nature of the broken link/cross-reference.

[13] Given the complexity of the markup, many markup issues will need to be verified manually since changes may be deliberate and needed.

[14] Issues related to fonts and text in graphics should be handled in other categories, as appropriate.

[15] NB: Significance in this context depends on the languages involved and many other factors and no general guideline as to what constitutes significance is possible.
