du aborts when a directory is moved during traversal

Version:

coreutils-8.5 (bug introduced in 5.1)

Bug Link:

http://www.mail-archive.com/coreutils@gnu.org/msg00772.html

Symptom:

du aborts with a failed assertion when two conditions are both met: (1) part of the hierarchy being traversed is moved to a higher level in the directory tree, and (2) at least one more command-line directory argument follows the one containing the moved sub-tree.

Failure type:

early termination

Are there log messages (default verbosity level) printed?

Yes

How was it diagnosed (reproduced or source analysis)?

Reproduced.

How to reproduce?

I ran

$ du -sk ~/*

to see where all the space is going.  The traversal almost certainly
was in ~/src, and being bored, I suspended it to examine that directory:

^Z
[1]+  Stopped                 du -sk ~/*
$ cd ~/src && du -sk *

~/src/netbsd was very big, so I moved it (using chmod +w first) to
~/.cache on the same filesystem for future reference.  And now time to
resume the toplevel search:

$ fg
du -sk ~/*
du: fts_read failed: No such file or directory
du: du.c:583: process_file: Assertion `level == prev_level - 1' failed.
Aborted (core dumped)

Root cause:

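A race condition: the directory tree is modified while du is traversing it. When the sub-tree is moved, fts_read fails and du_files leaves its loop early, but the static variable prev_level in process_file still holds the depth of the last entry visited. The first entry of the next command-line argument is more than one level shallower than that, so the assertion `level == prev_level - 1' fails. The fix makes prev_level a file-scope variable, resets it to 0 when the loop exits early, and adds the path to the fts_read error message: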

--- a/src/du.c
+++ b/src/du.c
@@ -63,8 +63,11 @@ extern bool fts_debug;
/* A set of dev/ino pairs.  */
static struct di_set *di_set;

-/* Define a class for collecting directory information. */
+/* Keep track of the preceding "level" (depth in hierarchy)
+   from one call of process_file to the next.  */
+static size_t prev_level;

+/* Define a class for collecting directory information. */
struct duinfo
{
  /* Size of files in directory.  */
@@ -399,7 +402,6 @@ process_file (FTS *fts, FTSENT *ent)
  struct duinfo dui;
  struct duinfo dui_to_print;
  size_t level;
-  static size_t prev_level;
  static size_t n_alloc;
  /* First element of the structure contains:
     The sum of the st_size values of all entries in the single directory
@@ -582,10 +584,15 @@ du_files (char **files, int bit_flags)
            {
              if (errno != 0)
                {
-                  /* FIXME: try to give a better message  */
-                  error (0, errno, _("fts_read failed"));
+                  error (0, errno, _("%s: fts_read failed"),
+                         quotearg_colon (fts->fts_path));
                  ok = false;
                }
+
+              /* When exiting this loop early, be careful to reset the
+                 global, prev_level, used in process_file.  Otherwise, its
+                 (level == prev_level - 1) assertion could fail.  */
+              prev_level = 0;
              break;
            }
          FTS_CROSS_CHECK (fts);
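
The effect of the stale prev_level can be illustrated with a small standalone model. This is only a sketch, not the code from du.c: visit and simulate are hypothetical names, the depth sequence 0, 1, 2, 3 is made up, and the check in visit is a simplified assumption about what process_file expects (when the depth decreases, it decreases by exactly one).

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for du's prev_level: depth of the last entry seen.  */
static size_t prev_level;

/* Model of the depth check: when the depth decreases, it must decrease
   by exactly one.  Returns false where du's assertion would fail.  */
static bool
visit (size_t level)
{
  bool ok = level >= prev_level || level == prev_level - 1;
  prev_level = level;
  return ok;
}

/* Hypothetical driver: the traversal of the first argument stops early
   at depth 3 (as if fts_read had failed), then a second command-line
   argument starts at depth 0.  */
static bool
simulate (bool reset_on_error)
{
  static const size_t first_arg[] = { 0, 1, 2, 3 };
  bool ok = true;

  prev_level = 0;
  for (size_t i = 0; i < sizeof first_arg / sizeof first_arg[0]; i++)
    ok &= visit (first_arg[i]);

  /* fts_read fails here; the patch resets prev_level before the break.  */
  if (reset_on_error)
    prev_level = 0;

  /* First entry of the next command-line argument.  */
  ok &= visit (0);
  return ok;
}
```

In this model, simulate (false) corresponds to the unpatched behavior, where the jump from depth 3 to depth 0 violates the check, and simulate (true) corresponds to the patched behavior, where the reset makes the next argument start from a clean state.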

The first error message:

          ent = fts_read (fts);
          if (ent == NULL)
            {
              if (errno != 0)
                {
                  /* FIXME: try to give a better message  */
                  error (0, errno, _("fts_read failed"));
                  ok = false;
                }
              break;
            }

Can Errlog automatically print the error message?

Yes. It matches the system call return value pattern: the failure is detected by checking the return value of fts_read and errno.
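
The return value pattern can be sketched outside of du as follows. This is a minimal example, assuming the fts interface from <fts.h> as provided by glibc and the BSDs; walk is a hypothetical helper and is not part of coreutils. It logs the path together with the errno text whenever fts_open or fts_read reports an error, which is the kind of message the patched du prints.

```c
#include <errno.h>
#include <fts.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: walks the tree rooted at ROOT and returns the
   number of entries visited, or -1 if fts_open or fts_read fails.  */
static long
walk (const char *root)
{
  char *const paths[] = { (char *) root, NULL };

  /* FTS_NOCHDIR keeps the walk from changing the working directory.  */
  FTS *fts = fts_open (paths, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
  if (fts == NULL)
    {
      fprintf (stderr, "%s: fts_open failed: %s\n", root, strerror (errno));
      return -1;
    }

  long n = 0;
  for (;;)
    {
      errno = 0;
      FTSENT *ent = fts_read (fts);
      if (ent == NULL)
        {
          /* NULL with errno == 0 means the traversal is complete;
             NULL with errno != 0 is an error worth logging.  */
          if (errno != 0)
            {
              fprintf (stderr, "%s: fts_read failed: %s\n",
                       fts->fts_path, strerror (errno));
              n = -1;
            }
          break;
        }
      n++;
    }

  fts_close (fts);
  return n;
}
```

Calling walk (".") visits the current directory tree and returns a positive count when no error occurs; directories are counted twice because fts_read returns them in both pre-order and post-order.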