From: bobg@... (Bob Glickstein)
Date: Thu, 7 Jan 1993 17:53:42 -0800
To: rms@...
Subject: Gzip
Cc: friedman@..., jloup@...

I am writing this letter to you on the advice of Jean-Loup Gailly, author of gzip (who is in the Cc list).  I have corresponded with him on the subject of the default filename suffix used by gzip.  I believe using ".z" is wrong for several reasons, technical and political.  I am writing this as a concerned net.citizen with a strong personal desire to see the GNU project achieve acceptance as nearly universal as possible.

There are many reasons for wanting to be able to distinguish file types based solely on filenames.  Auto-selecting a mode when visiting a file in Emacs is one example.  Performing shell-level operations on related groups of files using wildcards ("*.z") is another.

With the desirability of distinctive filename suffixes in mind, consider the problems posed by compress, pack, and MS-DOS.

Compress, as you know, uses a default suffix of ".Z".  This would not be in conflict with gzip-generated files except on monocase filesystems like that of MS-DOS.  There, a ".z" file cannot be distinguished from a ".Z" file.  Perhaps you are not concerned with addressing this particular problem, since all DOS users are lusers.  But consider that many data files over the course of their lifetimes pass from a Unix system, through a DOS system, and back to Unix.  The name of a ".z" file could be confusingly mangled to ".Z", or vice versa, in the process.

Next consider the Huffman-coding compressor, "pack", which also uses the ".z" suffix.  In my first communication with M. Gailly, he advised me that since the use of pack is not widespread, the name collision is not considered a problem.  Although I don't know how widespread the use of "pack" is in general, I do know that it is fairly integral to the operation of my SGI Indigo, in that all of the supplied catman pages are "packed" and named "ls.1.z" and so on.  Even if this weren't the case, deliberately choosing a filename suffix that is already in non-trivial use, especially by software which serves a similar purpose but which is incompatible, seems almost criminally obfuscatory.  (After my first exchange with M. Gailly, he acknowledged that the use of pack is more widespread than he originally believed.)

When I raised this point with M. Gailly, he proposed as a solution building pack/unpack functionality into gzip/gunzip.  Thus gzip could deal with a ".z" file no matter which program created it.  This approach goes part of the way toward solving the problem, but not all the way.  It is "gzip-centric," in the sense that pack will always write ".z" files which gzip can always read, but gzip will always write ".z" files which unpack can never read.  Not only does it assault my sense of symmetry, but it will lead to even more confusion as users attempt to determine which .z files were created by which program.  After all, not every site will have gunzip for dealing with all .z files (our hopes to the contrary notwithstanding).

If it is the FSF's hope that gzip will achieve wide acceptance and displace compress as the long-lived, high-performance, widely-available compressor of choice, it is off to a bad start in deliberately overloading filename suffixes.  GNU software has a good and ever-improving reputation for clarity, correctness, and superiority to the software it replaces.  It is always backward-compatible with its predecessors, and corrects the bad design decisions that were present in the original.  In short, GNU is, so far in my experience, *completely* brain-damage-free, which is utterly remarkable; on the other hand, if GNU software starts confounding my filespace, the armor will lose some of its shine.

This is not the way to sell recalcitrant users on what you hope will be a new standard.

Why so strongly resist choosing a new filename suffix, especially when promoting a new file format?  The need to choose a new suffix would almost seem to be a foregone conclusion.  If I were designing a new encoding for pixmaps, I certainly wouldn't suffix the image files with ".gif" even if it did stand for "Glickstein's Image Format".

I believe that sticking with a ".z" suffix will diminish the acceptance of gzip and possibly tarnish the perception of GNU good sense.  Please let me know your thoughts.


Bob Glickstein
Z-Code Software Corp.