HDFS-3616 Report

1. Symptom

ConcurrencyModificationException is thrown sometimes when the test case named TestWebHdfsWithMultipleNameNodes runs.

1.1 Severity

Major.

1.2 Was there exception thrown?

Yes.

org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes Time elapsed: 2.393 sec <<< ERROR!

java.util.ConcurrentModificationException

The HDFS test is aborted when the exception is thrown and the rest of the tests are skipped.

1.2.1 Were there multiple exceptions?

No.

1.3 Scope of the failure

In this case, the failure only affects the rest of the test case (it prevents all the tests coming after HDFS from running), but other manifestations could implicate in more execution abortions and exceptions. Since the Iterator is a fast-fail object, no big consequences should happen regarding this concurrency problem.

2. How to reproduce this failure

2.0 Version

2.0.0-alpha or later (the bug still persists on the present version, 2.0.5-alpha).

2.1 Configuration

Automatically made by the test case, but it doesn’t actually matter very much.This concurrency bug only needs one datanode shutting down and the precise timing to manifest itself.

2.2 Reproduction procedure

Just loop the test case until you see the bug. The following command can run the test case if performed under the hadoop folder.

mvn test -Dtest=TestWebHdfsWithMultipleNameNodes

On the current version, the frequency of the failure by this bug is 0.05%. It can be increased to 0.20% with a particular light change on the source code (Appendix A).

2.2.1 Timing order

As a concurrency bug, the timing is simply essential, but it is also out of the user’s control.

2.2.2 Events order externally controllable?

No. The CPU controls the thread switch.

2.3 Can the logs tell how to reproduce the failure?

Yes. They just don’t tell how low is the frequency of this failure (could be as low as 0.05%).

2.4 How many machines needed?

1 machine and the test case sets it up automatically.

3. Diagnosis procedure

The propagation of the error is short in this scenario. The exception is thrown immediately when the Iterator object is accessed concurrently.

The fix for this bug, though, was still not commited and the current version (2.0.5-alpha) still presents the bug. The Iterator should not be accessed concurrently and a way to assess the prevention of this concurrent access is one possible way to fix the bug.

3.1 Detailed Symptom (where you start)

ConcurrentModificationException is thrown at the HDFS test session on TestWebHdfsWithMultipleNameNodes test case.

3.2 Backward inference

-------------------------------------------------------------------------------

Test set: org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes

-------------------------------------------------------------------------------

Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.602 sec <<< FAILURE!

org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes Time elapsed: 2.393 sec <<< ERROR!

java.util.ConcurrentModificationException

at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)

at java.util.HashMap$EntryIterator.next(HashMap.java:834)

at java.util.HashMap$EntryIterator.next(HashMap.java:832)

at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:212)

at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.shutdown(FsVolumeList.java:171)

at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdown(FsDatasetImpl.java:1215)

at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:1124)

at org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:1280)

at org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:1260)

at org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes.shutdownCluster(TestWebHdfsWithMultipleNameNodes.java:100)

(...)

The stack above helps significantly on the diagnosis. It shows right away the nature of the problem (inappropriate concurrent access) and one of the services related to the problem (the shutdown of a DataNode). Finally, by looking at the last hadoop method called before the native java classes were called, we can tell that the object that makes use of the Iterator is the bpSlices. So, this method and another one were trying to access the object at the same time. The stack did not show what was the other method, but the bug assignee found it and posted on the page. So, the root cause is now well defined. The problem is how to fix it.

4. Root cause

Concurrent access to the bpSlices object from the FsVolumeImpl.shutdown() and FsVolumeImpl.shutdownBlockPool() at the same time.

4.1 Category:

Concurrency Bug.

5. Fix

5.1 How?

Prevent threads from accessing the Iterator at the same time. How to do this prevention, though, is so not trivial that the bug is still present on the current alpha release.

APPENDIX A

Statistic data regarding the hadoop version reported on the JIRA page and other later versions, including the present one - hadoop-2.0.5-alpha.

The bug was present even before 2.0.0, but for some reason it would not show up on the corresponding test case easily. Some modifications made regarding other improvements and other bugs may have exposed this bug.

On the report page, the assignee proposed a fix to the bug. In his patch, the HashMap object related to the bug, bpSlices, would become a ConcurrentHashMap object. The problem is that the bug comes from an iterator related to this HashMap, and according to Oracle, iterators are designed to be used by one thread at a time[1].

A busy waiting was added on DataNode.shutdown() method, right before the shutDownAll() function is called.

Figure 1 - Busy waiting on shutdown() method, at the DataNode class.

A call to the Thread.wait() method was made and 200 was the parameter, giving the other thread some time before the bpSlices object is removed.

Figure 2 - Wait call right before the bpSlices object is removed.

Finally, the most important change was moving a set action further. Since the shouldRun attribute is the key to initiate the exclusion of the bpSlice object, the last modification was to simply move the “shouldRun = false” a little further within the shutdown method.

Figure 3 - The comparison of a patch of the source code before (left) and after (right) an attribute set was moved further on the code.

As we can see in the following table, the concurrency bug did not manifest itself even once out of over 450 executions of the test, proving that the manifestation frequency is extremely low, at least on this test. This explains why the bug is still present on the current release.

Version	Executions	Exception Failures	Concurrent Modification Exception (CME)	Exception Frequency	CME Frequency	Extra Exception
Original 2.0.0	474	7	0	1.48%	0.00%	7 * IOException: Failed to add a datanode
Modified 2.0.0	479	68	62	14.20%	12.94%	8 * IOException: Failed to add a datanode
Original 2.0.2	588	15	5	2.55%	0.85%	10 * RemoteException: Failed to add a datanode
Original 2.0.3	602	4	4	0.66%	0.66%
Modified 2.0.3	2086	25	25	1.20%	1.20%
Original 2.0.5	2038	1	1	0.05%	0.05%
Modified 2.0.5	1991	4	4	0.20%	0.20%

The frequency of the bug on 2.0.0 was increased from 0 to almost 13% of the executions (68 out of 479) when applied all the three mentioned changes (and some log messages, irrelevant to the execution).

On 2.0.3, the two first changes were already applied, because the bug assignee commited the changes to the trunk. With the third modification, the bug frequency almost doubled. Also on 2.0.3, the other IO Exceptions and Remote Exceptions present before due to a DataNode Inclusion Failure stopped being thrown. The failures were probably reported, solved and applied to this version.

Finally, on 2.0.5, also with the two first “fixes” applied by default, the frequency was increased by 4 times after the third change was made.

REFERENCES

[1] Class ConcurrentHashMap<K,V>. Online source:

http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html

Accessed on August 8th of 2013.