[hadoop]MAPREDUCE-3125 Report

1. Symptom

In the web UI (screenshot omitted from this report), the first task’s status was still “Running” (in fact, it had been killed), but its progress already showed 100%.

Clicking on that task showed the same inconsistent state in its attempt-level details (screenshot omitted).

1.1 Severity

Critical

1.2 Was there exception thrown?

Yes. The job failed with a “ShuffleError”, which was caused by MAPREDUCE-3124 (the AM could not load the native Java classes). However, that exception is unrelated to this particular bug.

1.2.1 Were there multiple exceptions?

No.

2. How to reproduce this failure

2.0 Version

0.23.0

2.1 Configuration

Standard

2.2 Reproduction procedure

1. Start an MR job:

 hadoop jar ~/research/hadoop/hadoop-2.0.2-alpha-src/hadoop-dist/target/hadoop-2.0.2-alpha/share/hadoop/mapreduce/hadoop-*-examples.jar randomwriter rand

2. Kill the job.

3. Check the web interface.

2.2.1 Timing order

Killing a running job is sufficient; no special timing is required.

2.2.2 Events order externally controllable?

Yes. The kill is issued manually, so the event order is fully controllable.

2.3 Can the logs tell how to reproduce the failure?

Yes. The distinctive event is the job kill:

./container_1375285695285_0001_01_000001/syslog:2013-07-31 11:53:21,982 ERROR [TaskCleaner Event Handler] org.apache.hadoop.mapreduce.v2.app.taskclean.TaskCleanerImpl: Returning, interrupted : java.lang.InterruptedException

2.4 How many machines needed?

1 (the AM and RM can run on the same machine).

3. Diagnosis procedure

Very straightforward once the symptom is observed.

3.1 Detailed Symptom (where you start)

The web UI state described in Section 1: a killed task still reporting “Running” with 100% progress (screenshot omitted).

3.2 Backward inference

The key observation is that the job was killed, yet the task still reports progress. It is not hard to locate the code in the AM that computes the progress:

TaskImpl.java:

  public float getProgress() {
    readLock.lock();
    try {
      TaskAttempt bestAttempt = selectBestAttempt();
      if (bestAttempt == null) {
        return 0;
      }
      return bestAttempt.getProgress();
    } finally {
      readLock.unlock();
    }
  }

  //select the nextAttemptNumber with best progress
  // always called inside the Read Lock
  private TaskAttempt selectBestAttempt() {
    float progress = 0f;
    TaskAttempt result = null;
    for (TaskAttempt at : attempts.values()) {
      if (result == null) {
        result = at; //The first time around
      }
      // calculate the best progress
      if (at.getProgress() > progress) {
        result = at;
        progress = at.getProgress();
      }
    }
    return result;
  }

TaskAttemptImpl.java:

  public float getProgress() {
    readLock.lock();
    try {
      return reportedStatus.progress;
    } finally {
      readLock.unlock();
    }
  }

The problem: the loop in selectBestAttempt() never checks whether an attempt is already dead, so a killed attempt’s last reported progress still participates in the maximum.
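
To make the failure mode concrete, here is a minimal, self-contained sketch of the selection logic above. The Attempt and State types are hypothetical simplifications standing in for Hadoop’s TaskAttempt and its state machine; the loop itself mirrors the buggy selectBestAttempt(). A killed attempt whose last reported progress was 100% still wins the max:

```java
import java.util.List;

// Sketch of the buggy selection: a pure max over reported progress,
// with the attempt's state never consulted (hypothetical types).
public class BuggySelection {
    enum State { RUNNING, KILLED }

    static class Attempt {
        final State state;
        final float progress;
        Attempt(State state, float progress) {
            this.state = state;
            this.progress = progress;
        }
    }

    // Mirrors TaskImpl.selectBestAttempt(): keep whichever attempt
    // reported the highest progress, dead or alive.
    static Attempt selectBestAttempt(List<Attempt> attempts) {
        float progress = 0f;
        Attempt result = null;
        for (Attempt at : attempts) {
            if (result == null) {
                result = at; // the first time around
            }
            if (at.progress > progress) {
                result = at;
                progress = at.progress;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A killed attempt with a stale 100% progress beats a live
        // attempt at 40% -- exactly the symptom in the web UI.
        Attempt best = selectBestAttempt(List.of(
            new Attempt(State.KILLED, 1.0f),
            new Attempt(State.RUNNING, 0.4f)));
        System.out.println(best.state + " " + best.progress); // prints "KILLED 1.0"
    }
}
```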

4. Root cause

--- hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
+++ hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
@@ -441,10 +441,20 @@ public abstract class TaskImpl implements Task, EventHandler<TaskEvent> {
    float progress = 0f;
    TaskAttempt result = null;
    for (TaskAttempt at : attempts.values()) {
+      switch (at.getState()) {
+      
+      // ignore all failed task attempts
+      case FAIL_CONTAINER_CLEANUP:
+      case FAIL_TASK_CLEANUP:
+      case FAILED:
+      case KILL_CONTAINER_CLEANUP:
+      case KILL_TASK_CLEANUP:
+      case KILLED:
+        continue;      
+      }      
      if (result == null) {
        result = at; //The first time around
      }
-      //TODO: consider the nextAttemptNumber only if it is not failed/killed ?
       // calculate the best progress
      if (at.getProgress() > progress) {
        result = at;

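The effect of the patch can be sketched the same way (again with hypothetical simplified types, not the real Hadoop classes): terminal attempts are skipped before the progress comparison, and if every attempt is dead the selection returns null, so TaskImpl.getProgress() falls back to 0.

```java
import java.util.List;

// Sketch of the patched selection: failed/killed attempts are skipped,
// as in the switch added by the fix (the real patch also skips the
// *_CONTAINER_CLEANUP and *_TASK_CLEANUP states).
public class FixedSelection {
    enum State { RUNNING, FAILED, KILLED }

    static class Attempt {
        final State state;
        final float progress;
        Attempt(State state, float progress) {
            this.state = state;
            this.progress = progress;
        }
    }

    static Attempt selectBestAttempt(List<Attempt> attempts) {
        float progress = 0f;
        Attempt result = null;
        for (Attempt at : attempts) {
            switch (at.state) {
            // ignore all dead attempts
            case FAILED:
            case KILLED:
                continue;
            }
            if (result == null) {
                result = at; // the first time around
            }
            if (at.progress > progress) {
                result = at;
                progress = at.progress;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The live attempt now wins, despite the killed one's stale 100%.
        Attempt best = selectBestAttempt(List.of(
            new Attempt(State.KILLED, 1.0f),
            new Attempt(State.RUNNING, 0.4f)));
        System.out.println(best.state + " " + best.progress); // prints "RUNNING 0.4"

        // With only dead attempts, nothing is selected and the caller
        // reports 0 progress.
        System.out.println(selectBestAttempt(
            List.of(new Attempt(State.KILLED, 1.0f))) == null); // prints "true"
    }
}
```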
4.1 Category:

Incorrect exception handling: the job-kill event was not taken into account when computing task progress.

5. Fix

See above.

5.1 How?

Ignore failed and killed task attempts in selectBestAttempt(), so that the stale progress of a dead attempt is never reported.