[coreutils] tr "[:upper:][:lower:]" "[:upper:]"

Version:

coreutils-6.9.90 (fixed in 8.6)

How to reproduce?

In all locales, tr fails to diagnose some invalid input specifications.

Reproduce this case with the command:

tr "[:upper:][:lower:]" "[:upper:]"

In version 6.9.90, running tr "[:upper:][:lower:]" "[:upper:]" reports no error. Instead, it simply translates every character into “Z”.

Background:

What is tr?

tr is a translation program. It reads from standard input and writes the translated text to standard output. For example:

tr a b

will translate every occurrence of “a” to “b”.

It also accepts some standardized character classes as arguments; [:upper:] and [:lower:] are two of them. For example:

tr "[:upper:]" "[:lower:]"

will translate all uppercase letters into lowercase.
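
The translation described above can be pictured as a lookup table indexed by byte value. The following is a minimal sketch of that idea, not the actual tr source: the names build_map and translate are hypothetical, and it assumes both sets are plain strings of equal length with no ranges, classes, or escapes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_CHARS 256

/* Start from the identity mapping, then make each byte of set1 map
   to the byte at the same position in set2.  Assumes the two sets
   are plain strings of equal length.  */
void
build_map (const char *set1, const char *set2, unsigned char map[N_CHARS])
{
  size_t n = strlen (set1);
  assert (n == strlen (set2));

  for (int c = 0; c < N_CHARS; c++)
    map[c] = (unsigned char) c;
  for (size_t i = 0; i < n; i++)
    map[(unsigned char) set1[i]] = (unsigned char) set2[i];
}

/* Translate a NUL-terminated buffer in place using the table.  */
void
translate (char *buf, const unsigned char map[N_CHARS])
{
  for (; *buf != '\0'; buf++)
    *buf = (char) map[(unsigned char) *buf];
}
```

With a table built from "a" and "b", translating the buffer "banana" yields "bbnbnb". A character class such as [:upper:] can be thought of as expanding to one table entry per member of the class.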

Symptom:

Incorrect results caused by accepting invalid input.

In version 6.9.90, running tr "[:upper:][:lower:]" "[:upper:]" reports no error. Instead, it simply translates every character into “Z”.

This is an invalid input, and the correct behavior is for tr to reject it.

The fixed version (8.6) reports the following:

../coreutils-8.6/src/tr: when translating with string1 longer than string2,
the latter string must not end with a character class
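
The rule stated in that message can be sketched as a small predicate. This is only an illustration under simplifying assumptions: struct set_info and check_extend are hypothetical names, not part of tr, and the sketch covers only the length rule, not the other validation tr performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified summary of a parsed set.  */
struct set_info
{
  size_t length;         /* number of characters the set expands to */
  bool ends_with_class;  /* true if the last element is a class
                            such as [:upper:] */
};

/* Return NULL if set2 may be extended to the length of set1,
   otherwise return the diagnostic text.  */
const char *
check_extend (const struct set_info *s1, const struct set_info *s2)
{
  assert (s1 != NULL && s2 != NULL);

  if (s1->length > s2->length && s2->ends_with_class)
    return "when translating with string1 longer than string2,\n"
           "the latter string must not end with a character class";
  return NULL;
}
```

For the failing command, string1 expands to 52 characters and string2 to 26 characters ending in [:upper:], so the predicate returns the diagnostic instead of NULL.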

Root cause:

 static void
 string2_extend (const struct Spec_list *s1, struct Spec_list *s2)
 {
   … ...
   switch (p->type)
     {
     case RE_NORMAL_CHAR:
       char_to_repeat = p->u.normal_char;
       break;
     case RE_RANGE:
       char_to_repeat = p->u.range.last_char;
       break;
     case RE_CHAR_CLASS:
       /* Originally the developers were testing on BSD systems, so
          they thought this case was OK, but it is actually not OK on
          the majority of systems.  So in the patch they simply
          reject the input.  */
+      /* Note BSD allows extending of classes in string2.  For example:
+           tr '[:upper:]0-9' '[:lower:]'
+         That's not portable however, contradicts POSIX and is dependent
+         on your collating sequence.  */
+      error (EXIT_FAILURE, 0,
+             _("when translating with string1 longer than string2,\n\
+the latter string must not end with a character class"));
+      abort (); /* inform gcc that the above use of error never returns. */
-      for (i = N_CHARS - 1; i >= 0; i--)
-        if (is_char_class_member (p->u.char_class, i))
-          break;
-      assert (i >= 0);        ← this is essentially a default check, but it’s not turned on
-      char_to_repeat = i;
       break;
     .. .. ..
     default:
       abort ();
       break;
     }
   append_repeated_char (s2, char_to_repeat, s1->length - s2->length);
   s2->length = s1->length;
 }
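
To see where the “Z” comes from, the removed lines can be modeled in isolation. The sketch below is a simplified stand-in, not the tr implementation: last_class_member and pad_set2 are hypothetical names, sets are plain C strings rather than Spec_list structures, and the standard isupper function in the default "C" locale stands in for is_char_class_member.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define N_CHARS 256

/* Return the highest byte value accepted by is_member, scanning
   downward from N_CHARS - 1 like the loop the patch removes.  */
int
last_class_member (int (*is_member) (int))
{
  int i;
  for (i = N_CHARS - 1; i >= 0; i--)
    if (is_member (i))
      break;
  assert (i >= 0);  /* the class must have at least one member */
  return i;
}

/* Pad set2 with pad_char until it is as long as set1.  cap is the
   size of the set2 buffer.  Return 0 on success, -1 if the padded
   string would not fit.  */
int
pad_set2 (const char *set1, char *set2, size_t cap, char pad_char)
{
  size_t len1 = strlen (set1);
  size_t len2 = strlen (set2);

  if (len1 + 1 > cap)
    return -1;
  if (len2 < len1)
    {
      memset (set2 + len2, pad_char, len1 - len2);
      set2[len1] = '\0';
    }
  return 0;
}
```

In the "C" locale on an ASCII system, last_class_member (isupper) returns 'Z'. Padding the 26-character expansion of [:upper:] to match the 52-character expansion of [:upper:][:lower:] therefore fills positions 26 to 51 with 'Z', which is the character the old code repeats for the part of string1 that has no counterpart in string2.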

Is there an error message?

No.

Can Errlog insert an error message?

No.