HDFS-3415 Report
https://issues.apache.org/jira/browse/HDFS-3415
Symptom: the NameNode fails to start. In the NN’s log, we see the following exception:
2013-07-10 16:38:02,304 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.namenode.NNStorage.getStorageFile(NNStorage.java:686)
at org.apache.hadoop.hdfs.server.namenode.FSImagePreTransactionalStorageInspector.getEditsInStorageDir(FSImagePreTransactionalStorageInspector.java:244)
at org.apache.hadoop.hdfs.server.namenode.FSImagePreTransactionalStorageInspector.getLatestEditsFiles(FSImagePreTransactionalStorageInspector.java:262)
at org.apache.hadoop.hdfs.server.namenode.FSImagePreTransactionalStorageInspector.getEditLogStreams(FSImagePreTransactionalStorageInspector.java:279)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:596)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:247)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:498)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:390)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:354)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:369)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:403)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:570)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:551)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1114)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1170)
Severity: Major.
Reproducible: Yes.
Affected version: 2.0.0-alpha.
Setup: a single node; only the NN is needed.
The only non-default setting is in hdfs-site.xml:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/ding/research/hadoop/hadoop-2.0.0-alpha-src/hadoop-dist/target/hadoop-2.0.0-alpha/dfs/name,file:/home/ding/research/hadoop/hadoop-2.0.0-alpha-src/hadoop-dist/target/hadoop-2.0.0-alpha/dfs/name1</value>
  <final>true</final>
</property>
Reproduction steps:
$ hdfs namenode -format
$ hadoop-daemon.sh start namenode
$ hadoop-daemon.sh stop namenode
Edit dfs/name1/current/VERSION and change the layoutVersion from “-40” to “123”.
$ hadoop-daemon.sh start namenode
The NameNode fails to start.
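For reference, VERSION is a plain Java properties file (key=value pairs), so the corruption step can also be done programmatically. A minimal sketch (the class name is hypothetical; pass the path to dfs/name1/current/VERSION as the argument):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Properties;

public class CorruptVersion {
  public static void main(String[] args) throws IOException {
    Properties props = new Properties();
    try (FileInputStream in = new FileInputStream(args[0])) {
      props.load(in);                          // VERSION is key=value pairs
    }
    props.setProperty("layoutVersion", "123"); // healthy value is "-40"
    try (FileOutputStream out = new FileOutputStream(args[0])) {
      props.store(out, null);
    }
  }
}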
Diagnosis difficulty: hard. The key reproduction step, editing dfs/name1/current/VERSION to change “-40” to “123”, leaves no direct trace in the logs. But there are clues:
2013-07-10 16:38:02,295 WARN org.apache.hadoop.hdfs.server.namenode.FSImage: Ignoring one or more storage directories with old layouts
--- This log message should not appear in a correct (non-error) execution.
NNStorage.java (the “Ding: ...” warnings are apparently debug instrumentation added during this investigation, not part of the original code):

    if (LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, minLayoutVersion)) {
      FSImage.LOG.warn("Ding: Creating FSImageTransactionalStorageInspector");
      inspector = new FSImageTransactionalStorageInspector();
      if (!LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, maxLayoutVersion)) {
--- maxLayoutVersion in the buggy case is 123. Had VERSION not been modified, it would be -40, and this message would not be printed.
        FSImage.LOG.warn("Ignoring one or more storage directories with old layouts");
      }
    } else {
      FSImage.LOG.warn("Ding: Creating FSImagePreTransactionalStorageInspector");
      inspector = new FSImagePreTransactionalStorageInspector();
    }
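To see why this matters, recall that HDFS layout versions are negative integers, and more negative means newer; LayoutVersion.supports(feature, lv) is (roughly) true when lv is at least as new as the version that introduced the feature. A minimal self-contained sketch of the inspector choice (supportsTxidLayout is a simplified stand-in for LayoutVersion.supports, and -31 as the introduction version of TXID_BASED_LAYOUT is an assumption):

public class InspectorChoiceDemo {
  // Simplified stand-in for LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, lv).
  // Assumption: TXID_BASED_LAYOUT was introduced at layout version -31;
  // lv supports it iff lv <= -31 (more negative = newer).
  static boolean supportsTxidLayout(int lv) {
    return lv <= -31;
  }

  public static void main(String[] args) {
    int healthy = -40, corrupted = 123;
    int minLayoutVersion = Math.min(healthy, corrupted); // -40, i.e. the newest
    int maxLayoutVersion = Math.max(healthy, corrupted); // 123, i.e. the oldest
    System.out.println(supportsTxidLayout(minLayoutVersion)); // true  -> txid inspector chosen
    System.out.println(supportsTxidLayout(maxLayoutVersion)); // false -> the warning fires
  }
}

So the warning fires, but execution continues; and since readProperties(sd) overwrites the storage-wide layoutVersion on each iteration, it is apparently left at 123 (the last directory read), which poisons the later check in loadFSImage().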
One log message is enough: the exception itself.
(See the NullPointerException stack trace at the top of this report; its first frame is NNStorage.getStorageFile(NNStorage.java:686).)
We can locate the code:
/**
 * Get a storage file for one of the files that doesn't need a txid associated
 * (e.g version, seen_txid)
 */
static File getStorageFile(StorageDirectory sd, NameNodeFile type) {
  return new File(sd.getCurrentDir(), type.getName());
}
--- Challenge: from the NPE alone we cannot tell which argument is null: the StorageDirectory sd, or the NameNodeFile type.
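One way to remove that ambiguity (a sketch, not part of the actual patch; NNStorage already imports Guava’s Preconditions, as the diff below shows):

static File getStorageFile(StorageDirectory sd, NameNodeFile type) {
  // Fail fast with a descriptive message instead of an opaque NPE inside java.io.File
  Preconditions.checkNotNull(sd, "null StorageDirectory");
  Preconditions.checkNotNull(type, "null NameNodeFile type");
  return new File(sd.getCurrentDir(), type.getName());
}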
Going up the call stack, we arrive at this function:
/**
 * Choose latest image from one of the directories,
 * load it and merge with the edits from that directory.
 *
 * Saving and loading fsimage should never trigger symlink resolution.
 * The paths that are persisted do not have *intermediate* symlinks
 * because intermediate symlinks are resolved at the time files,
 * directories, and symlinks are created. All paths accessed while
 * loading or saving fsimage should therefore only see symlinks as
 * the final path component, and the functions called below do not
 * resolve symlinks that are the final path component.
 *
 * @return whether the image should be saved
 * @throws IOException
 */
boolean loadFSImage(FSNamesystem target, MetaRecoveryContext recovery)
    throws IOException {
  ...
  if (LayoutVersion.supports(Feature.TXID_BASED_LAYOUT,
      getLayoutVersion())) {
    // If we're open for write, we're either non-HA or we're the active NN, so
    // we better be able to load all the edits. If we're the standby NN, it's
    // OK to not be able to read all of edits right now.
    long toAtLeastTxId = editLog.isOpenForWrite() ? inspector.getMaxSeenTxId() : 0;
    editStreams = editLog.selectInputStreams(imageFile.getCheckpointTxId() + 1,
        toAtLeastTxId, false);
  } else {
    editStreams = FSImagePreTransactionalStorageInspector
        .getEditLogStreams(storage);
  }
--- So the layout version read from storage (123) does not support TXID_BASED_LAYOUT, which is why an FSImagePreTransactionalStorageInspector is used. This is wrong: the directories actually hold the new, txid-based layout, so the pre-transactional inspector cannot find the state it expects and ends up with the NPE above.
In fact, layout version 123 should simply have been rejected. The fix is to reject it the first time it is detected, in readAndInspectDirs() below:
/**
 * Iterate over all of the storage dirs, reading their contents to determine
 * their layout versions. Returns an FSImageStorageInspector which has
 * inspected each directory.
 *
 * <b>Note:</b> this can mutate the storage info fields (ctime, version, etc).
 * @throws IOException if no valid storage dirs are found
 */
FSImageStorageInspector readAndInspectDirs()
    throws IOException {
  int minLayoutVersion = Integer.MAX_VALUE; // the newest
  int maxLayoutVersion = Integer.MIN_VALUE; // the oldest

  // First determine what range of layout versions we're going to inspect
  for (Iterator<StorageDirectory> it = dirIterator();
       it.hasNext();) {
    StorageDirectory sd = it.next();
    if (!sd.getVersionFile().exists()) {
      FSImage.LOG.warn("Storage directory " + sd + " contains no VERSION file. Skipping...");
      continue;
    }
    readProperties(sd); // sets layoutVersion
    minLayoutVersion = Math.min(minLayoutVersion, getLayoutVersion());
    maxLayoutVersion = Math.max(maxLayoutVersion, getLayoutVersion());
  }

  if (minLayoutVersion > maxLayoutVersion) {
    throw new IOException("No storage directories contained VERSION information");
  }
  assert minLayoutVersion <= maxLayoutVersion;

  // If we have any storage directories with the new layout version
  // (ie edits_<txnid>) then use the new inspector, which will ignore
  // the old format dirs.
  FSImageStorageInspector inspector;
  if (LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, minLayoutVersion)) {
    inspector = new FSImageTransactionalStorageInspector();
    if (!LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, maxLayoutVersion)) {
      FSImage.LOG.warn("Ignoring one or more storage directories with old layouts");
--- Invalid case: an exception should be thrown here to reject the inconsistent layout versions early.
    }
  } else {
    FSImage.LOG.warn("Ding: Creating FSImagePreTransactionalStorageInspector");
    inspector = new FSImagePreTransactionalStorageInspector();
  }

  inspectStorageDirs(inspector);
  return inspector;
}
Patch explained:
diff --git hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
index d5061b7..01c951a 100644
--- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
@@ -31,6 +31,7 @@ import java.util.Collection;
 import java.util.HashMap;
 import java.util.Iterator;
 import java.util.List;
+import java.util.Map;
 import java.util.Properties;
 import java.util.UUID;
 import java.util.concurrent.CopyOnWriteArrayList;
@@ -58,6 +59,7 @@ import org.apache.hadoop.net.DNS;
 import com.google.common.base.Preconditions;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
 
 /**
  * NNStorage is responsible for management of the StorageDirectories used by
@@ -1076,13 +1078,14 @@ public class NNStorage extends Storage implements Closeable {
    * inspected each directory.
    *
    * <b>Note:</b> this can mutate the storage info fields (ctime, version, etc).
-   * @throws IOException if no valid storage dirs are found
+   * @throws IOException if no valid storage dirs are found or no valid layout version
    */
   FSImageStorageInspector readAndInspectDirs()
       throws IOException {
-    int minLayoutVersion = Integer.MAX_VALUE; // the newest
-    int maxLayoutVersion = Integer.MIN_VALUE; // the oldest
-
+    Integer layoutVersion = null;
+    boolean multipleLV = false;
+    StringBuilder layoutVersions = new StringBuilder();
+
     // First determine what range of layout versions we're going to inspect
     for (Iterator<StorageDirectory> it = dirIterator();
          it.hasNext();) {
@@ -1092,24 +1095,29 @@ public class NNStorage extends Storage implements Closeable {
         continue;
       }
       readProperties(sd); // sets layoutVersion
-      minLayoutVersion = Math.min(minLayoutVersion, getLayoutVersion());
-      maxLayoutVersion = Math.max(maxLayoutVersion, getLayoutVersion());
+      int lv = getLayoutVersion();
+      if (layoutVersion == null) {
+        layoutVersion = Integer.valueOf(lv);
+      } else if (!layoutVersion.equals(lv)) {
+        multipleLV = true;
+      }
+      layoutVersions.append("(").append(sd.getRoot()).append(", ").append(lv).append(") ");
     }
-    if (minLayoutVersion > maxLayoutVersion) {
+    if (layoutVersion == null) {
       throw new IOException("No storage directories contained VERSION information");
     }
-    assert minLayoutVersion <= maxLayoutVersion;
-
-    // If we have any storage directories with the new layout version
+    if (multipleLV) {
+      throw new IOException(
+          "Storage directories contain multiple layout versions: "
+          + layoutVersions);
+    }
+    // If the storage directories are with the new layout version
     // (ie edits_<txnid>) then use the new inspector, which will ignore
     // the old format dirs.
     FSImageStorageInspector inspector;
-    if (LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, minLayoutVersion)) {
+    if (LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, getLayoutVersion())) {
       inspector = new FSImageTransactionalStorageInspector();
-      if (!LayoutVersion.supports(Feature.TXID_BASED_LAYOUT, maxLayoutVersion)) {
-        FSImage.LOG.warn("Ignoring one or more storage directories with old layouts");
-      }
     } else {
       inspector = new FSImagePreTransactionalStorageInspector();
     }
2013-07-10 17:52:26,391 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2013-07-10 17:52:26,392 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2013-07-10 17:52:26,392 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2013-07-10 17:52:26,393 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Storage directories contain multiple layout versions: (/home/ding/research/hadoop/hadoop-2.0.2-alpha-src/hadoop-dist/target/hadoop-2.0.2-alpha/dfs/name, -40) (/home/ding/research/hadoop/hadoop-2.0.2-alpha-src/hadoop-dist/target/hadoop-2.0.2-alpha/dfs/name1, 123)
at org.apache.hadoop.hdfs.server.namenode.NNStorage.readAndInspectDirs(NNStorage.java:988)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:561)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:241)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:544)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:423)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:385)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:402)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:436)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:612)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:593)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1176)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1240)
2013-07-10 17:52:26,395 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2013-07-10 17:52:26,396 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
Mishandled error condition --- on detecting mismatched layout versions, the NameNode should simply bail out instead of letting execution continue.