HDFS-3436 Report
https://issues.apache.org/jira/browse/HDFS-3436
When appending to a file, the append failed (exception thrown and content not appended).
13/06/20 14:39:15 WARN hdfs.DFSClient: DataStreamer Exception
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
Major
Yes. There are multiple exceptions from DNs.
2013-06-20 14:39:15,664 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013, type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:993)
2013-06-20 14:39:15,673 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
at sun.nio.ch.IOUtil.read(IOUtil.java:171)
2013-06-20 14:39:15,675 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013 received exception java.io.IOException: Connection reset by peer
2013-06-20 14:39:15,675 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: research-virtual-machine:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.124.129:56689 dest: /172.16.124.129:50010
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
at sun.nio.ch.IOUtil.read(IOUtil.java:171)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:245)
at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:54)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
2013-06-20 14:39:15,640 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode{data=FSDataset{dirpath='[/home/ding/research/hadoop/hadoop-2.0.0-alpha-src/hadoop-dist/target/hadoop-2.0.0-alpha/dfs/data/current]'}, localName='research-virtual-machine:50010', storageID='DS-1891234353-127.0.1.1-50010-1371753278058', xmitsInProgress=0}:Exception transfering block BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013 to mirror 172.16.124.130:50010: java.net.ConnectException: Connection refused
2013-06-20 14:39:15,640 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013 received exception java.net.ConnectException: Connection refused
2013-06-20 14:39:15,642 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: research-virtual-machine:50010:DataXceiver error processing WRITE_BLOCK operation
src: /172.16.124.129:60109 dest: /172.16.124.128:50010
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
.. ..
2013-06-20 14:39:15,663 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: research-virtual-machine:50010:DataXceiver error processing TRANSFER_BLOCK operation src: /172.16.124.129:60110 dest: /172.16.124.128:50010
java.io.IOException: BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1012 is neither a RBW nor a Finalized, r=ReplicaBeingWritten, blk_8462409925548039750_1013, RBW
2013-06-20 14:38:47,532 DEBUG org.apache.hadoop.hdfs.server.datanode.DataNode: Sending heartbeat from service actor: Block pool BP-1107653285-127.0.1.1-1371753149490 (
2013-06-20 14:38:48,609 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at research-virtual-machine/127.0.1.1
************************************************************/
Clearly, there is a chain of exceptions, and we have:
DN3 shutdown → Connection refused exception on DN2 (when trying to connect to DN3) → “Premature EOF” exception on DN1 → “Premature EOF” exception on client
ding@research-virtual-machine:~/research/hadoop/hadoop-2.0.0-alpha-src/append_prog$ java -cp `hadoop classpath`:build/classes/ mypackage.HDFS_4022
Hello World
13/07/09 14:57:10 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as 172.16.124.131:50010
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1053)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:943)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
13/07/09 14:57:10 WARN hdfs.DFSClient: Error Recovery for block BP-1854916357-127.0.1.1-1373395657208:blk_-5676536617205446517_1011 in pipeline 172.16.124.129:50010, 172.16.124.130:50010, 172.16.124.131:50010: bad datanode 172.16.124.131:50010
But when you look at the tail of the file, you can see the content is appended!
Yes.
java.net.ConnectException
java.io.IOException
java.io.EOFException
Only one client’s request is affected.
2.0.0-alpha
A) set up: 4 nodes (straightforward)
--- Question: will we really need all 4 nodes?
Use baker221 for this failure:
ding-hdp-2.0.0-alpha-amd64-1: 172.16.124.129 (NN + DN)
ding-hdp-2.0.0-alpha-amd64-2: 172.16.124.128
ding-hdp-2.0.0-alpha-amd64-3: 172.16.124.130
ding-hdp-2.0.0-alpha-amd64-4: 172.16.124.131
B) set up the master:
core-site.xml: set every node’s core-site.xml to point to the NN (129)
slaves: add all 4 nodes into the “slaves”
hdfs-site.xml: set the replication factor to 3 (not 4)!
C) set up all the slaves
modify: core-site.xml, masters, hdfs-site.xml (is it really necessary?)
core-site.xml:
<property>
<name>fs.default.name</name>
<!-- <value>hdfs://localhost:8020</value> -->
<!-- <value>hdfs://dlrz.eecg.utoronto.ca:54310</value> -->
<!-- <value>hdfs://128.100.241.109:54310</value> -->
<value>hdfs://172.16.124.129:54310</value>
<final>true</final>
</property>
D) format the nn:
hadoop namenode -format
Also remember to delete the data directory in each DN.
E) start hdfs: start-dfs.sh on NN
F) upload a file:
hdfs dfs -put ./random.txt /
Verify that the file is being replicated:
hadoop fsck / -files -locations -blocks
Note: you need to check the NN’s log (located in logs/hadoop-ding-namenode-MACHINENAME.log) to see the datanode pipeline sequence for the last block of the file. For example, in my case, I found this msg:
2013-06-20 14:37:27,041 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: updatePipeline(block=BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1011, newGenerationStamp=1012, newLength=23183360, newNodes=[172.16.124.129:50010, 172.16.124.128:50010, 172.16.124.130:50010], clientName=DFSClient_NONMAPREDUCE_-1158938200_1)
G) Now stop DN3 in the write pipeline:
In this case, this DN is 172.16.124.130 (see above)
hadoop-daemon.sh stop datanode
--- Question: will the error still occur if we stop this DN3 for a while and then append?
Answer: No, but you need to wait for quite a while. At some point the NN will notice that DN3 is down, and then replicate all the blocks on that DN to a new DN:
2013-06-21 12:46:38,040 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 172.16.124.131:50010 to replicate blk_-5787218417578301815_1038 to datanode(s) 172.16.124.130:50010
H) append to the file:
java -cp `hadoop classpath`:build/classes/ mypackage.HDFS_4022
Hello World
--- this Java program (see below) does nothing more than append to a file.
ding@research-virtual-machine:~/research/hadoop/hadoop-2.0.0-alpha-src/hadoop-dist/target/hadoop-2.0.0-alpha/append_prog$ java -cp `hadoop classpath`:build/classes/ mypackage.HDFS_4022
Hello World
13/06/20 14:39:15 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as 172.16.124.130:50010
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1053)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:943)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
13/06/20 14:39:15 WARN hdfs.DFSClient: Error Recovery for block BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1012 in pipeline 172.16.124.129:50010, 172.16.124.128:50010, 172.16.124.130:50010: bad datanode 172.16.124.130:50010
13/06/20 14:39:15 DEBUG datatransfer.DataTransferProtocol: lastAckedSeqno = -1
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Connecting to datanode 172.16.124.128:50010
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Send buf size 131071
13/06/20 14:39:15 WARN hdfs.DFSClient: DataStreamer Exception
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
.. ..
Exception in thread "main" java.io.EOFException: Premature EOF: no length prefix available
.. ..
13/06/20 14:39:15 ERROR hdfs.DFSClient: Failed to close file /random.txt
java.io.EOFException: Premature EOF: no length prefix available
.. ..
The complete logs are located at:
baker221: ~ding/research/failure_cases/hdfs-3436
Java program (same as the one used in 4022):
package mypackage;

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class HDFS_4022 {
    public static void main(String[] args) throws Exception {
        System.out.println("Hello World");
        Configuration conf = new Configuration();
        DistributedFileSystem dfs = (DistributedFileSystem) DistributedFileSystem
            .get(URI.create("hdfs://192.168.103.137:54310/"), conf);
        Path f = new Path("/random.txt");
        FSDataOutputStream fsDataOutputStream = dfs.append(f);
        fsDataOutputStream.write("First Battle of Bull Run (First Manassas)".getBytes());
        fsDataOutputStream.close();
    }
}
A) Shutdown DN3 in the write pipeline
B) Append to the file
B) should occur relatively quickly after A).
Yes. If A) and B) occur in this particular sequence, the failure will occur!
Probably.
- Last DN of the pipeline of the LAST block is down
- Then append!
4 (or 3); we need a 3-node pipeline.
OK, now, what is the problem?
The exception:
I do not understand the meaning of this exception. What is the expected behavior? What exactly does “Premature EOF: no length prefix available” mean?
But at least it clearly tells where in the code it is printed.
It also tells about the troublesome block:
blk_8462409925548039750_1012
Since there are too many log messages from the 5 components, we first look at the relevant ones:
13/06/20 14:39:14 WARN util.KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Short circuit read is false
13/06/20 14:39:15 DEBUG hdfs.DFSClient: computePacketChunkSize: src=/random.txt, chunkSize=475, chunksPerPacket=1, packetSize=506
13/06/20 14:39:15 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for [DFSClient_NONMAPREDUCE_-1152429769_1] with renew id 1 started
13/06/20 14:39:15 DEBUG hdfs.DFSClient: DFSClient writeChunk allocating new packet seqno=0, src=/random.txt, packetSize=506, chunksPerPacket=1, bytesCurBlock=2318
13/06/20 14:39:15 DEBUG hdfs.DFSClient: DFSClient writeChunk packet full seqno=0, src=/random.txt, bytesCurBlock=23183442, blockSize=67108864, appendChunk=true
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Queued packet 0
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Queued packet 1
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Waiting for ack for: 1
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Append to block BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1012
13/06/20 14:39:15 DEBUG hdfs.DFSClient: pipeline = 172.16.124.129:50010
13/06/20 14:39:15 DEBUG hdfs.DFSClient: pipeline = 172.16.124.128:50010
13/06/20 14:39:15 DEBUG hdfs.DFSClient: pipeline = 172.16.124.130:50010
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Connecting to datanode 172.16.124.129:50010
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Send buf size 131071
13/06/20 14:39:15 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.io.IOException: Bad connect ack with firstBadLink as 172.16.124.130:50010
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1053)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:943)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
13/06/20 14:39:15 WARN hdfs.DFSClient: Error Recovery for block BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1012 in pipeline 172.16.124.129:5001
13/06/20 14:39:15 DEBUG datatransfer.DataTransferProtocol: lastAckedSeqno = -1
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Connecting to datanode 172.16.124.128:50010
13/06/20 14:39:15 DEBUG hdfs.DFSClient: Send buf size 131071
13/06/20 14:39:15 WARN hdfs.DFSClient: DataStreamer Exception
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:865)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:843)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
Exception in thread "main" java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:865)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:843)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
13/06/20 14:39:15 ERROR hdfs.DFSClient: Failed to close file /random.txt
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:865)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:843)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:934)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:461)
2013-06-20 14:37:26,778 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.appendFile: file /random.txt for DFSClient_NONMAPREDUCE_-1158938200_1 at 172.16.124.129 block BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1011{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.124.129:50010|RBW], ReplicaUnderConstruction[172.16.124.128:50010|RBW], ReplicaUnderConstruction[172.16.124.130:50010|RBW]]} block size 23183360
2013-06-20 14:37:27,219 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Reported block blk_8462409925548039750_1012 on 172.16.124.130:50010 size 23183401 replicaState = FINALIZED
2013-06-20 14:37:27,223 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Reported block blk_8462409925548039750_1012 on 172.16.124.128:50010 size 23183401 replicaState = FINALIZED
2013-06-20 14:37:27,226 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Reported block blk_8462409925548039750_1012 on 172.16.124.129:50010 size 23183401 replicaState = FINALIZED
2013-06-20 14:39:15,663 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* block RECEIVING_BLOCK: blk_8462409925548039750_1013 is received from DatanodeRegistration(172.16.124.128, storageID=DS-1891234353-127.0.1.1-50010-1371753278058, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=CID-2d21384f-d2ef-4922-893c-fe208aa332fb;nsid=1270822596;c=0)
Then there was the abnormal append: the genstamp became 1013 on DN1 and DN2, but nothing suspicious appears in the NN’s log.
2013-06-20 14:39:15,646 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Appending to replica FinalizedReplica, blk_8462409925548039750_1012, FINALIZED
2013-06-20 14:39:15,663 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as 172.16.124.130:50010
2013-06-20 14:39:15,663 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is 172.16.124.130:50010
--- These two messages seem OK: DN 172.16.124.130 cannot be reached.
2013-06-20 14:39:15,664 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013, type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:993)
at java.lang.Thread.run(Thread.java:662)
--- This EOFException seems suspicious...
2013-06-20 14:39:15,673 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
.. .. ..
2013-06-20 14:39:15,675 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: research-virtual-machine:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.124.129:56689 dest: /172.16.124.129:50010
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
.. .. ..
This “Premature EOF” exception is suspicious (and is also seen in client). Let’s see where they are printed:
/* Client code. */
private boolean createBlockOutputStream(DatanodeInfo[] nodes, long newGS,
boolean recoveryFlag) {
.. .. ..
try {
.. .. ..
out = new DataOutputStream(new BufferedOutputStream(
NetUtils.getOutputStream(s, writeTimeout),
HdfsConstants.SMALL_BUFFER_SIZE));
.. .. ..
// receive ack for connect
BlockOpResponseProto resp = BlockOpResponseProto.parseFrom(
HdfsProtoUtil.vintPrefixed(blockReplyStream));
pipelineStatus = resp.getStatus();
firstBadLink = resp.getFirstBadLink();
if (pipelineStatus != SUCCESS) {
if (pipelineStatus == Status.ERROR_ACCESS_TOKEN) {
throw new InvalidBlockTokenException(
"Got access token error for connect ack with firstBadLink as "
+ firstBadLink);
} else {
throw new IOException("Bad connect ack with firstBadLink as "
+ firstBadLink);
--- This was printed...
}
}
}
.. .. ..
}
After the first exception is received, the client tries to recover and calls setupPipelineForAppendOrRecovery again.
Locate the source:
DN1’s code:
public static InputStream vintPrefixed(final InputStream input)
throws IOException {
final int firstByte = input.read();
← Nothing valid was read from the DN here.
if (firstByte == -1) {
throw new EOFException("Premature EOF: no length prefix available");
}
.. ..
}
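This behavior is easy to reproduce in isolation. Below is a minimal sketch (not the Hadoop source; class and method names are illustrative) of the check vintPrefixed performs: the response must begin with a varint length prefix, and if the peer closed the connection without writing anything, read() returns -1 and we get exactly this "Premature EOF" message.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Sketch of vintPrefixed's first check: a peer that closes the socket
// before sending any response yields read() == -1, hence "Premature EOF".
public class PrematureEofSketch {
    static int readLengthPrefix(InputStream in) throws IOException {
        final int firstByte = in.read();
        if (firstByte == -1) {
            throw new EOFException("Premature EOF: no length prefix available");
        }
        return firstByte; // the real code goes on to decode a full varint
    }

    public static void main(String[] args) throws IOException {
        // A peer that did respond: the first byte is the length prefix.
        System.out.println(readLengthPrefix(new ByteArrayInputStream(new byte[]{5, 1, 2})));
        // A peer that closed without responding: empty stream.
        try {
            readLengthPrefix(new ByteArrayInputStream(new byte[0]));
        } catch (EOFException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

So the exception simply means the upstream socket was closed (or never produced a response) before any length prefix arrived, which matches a peer that aborted after its own error.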
Caller:
private void transfer(final DatanodeInfo src, final DatanodeInfo[] targets,
final Token<BlockTokenIdentifier> blockToken) throws IOException
{
//transfer replica to the new datanode
Socket sock = null;
DataOutputStream out = null;
DataInputStream in = null;
try {
sock = createSocketForPipeline(src, 2, dfsClient);
.. ..
//ack
BlockOpResponseProto response =
BlockOpResponseProto.parseFrom(HdfsProtoUtil.vintPrefixed(in));
}
This exception is never handled. Basically, the error means that the socket created by DN1 (to the next DN, DN2) contained nothing. So let's examine DN2's log.
2013-06-20 14:39:15,634 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Appending to replica FinalizedReplica, blk_8462409925548039750_1012, FINALIZED
2013-06-20 14:39:15,640 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode{data=FSDataset{dirpath='[/home/ding/research/hadoop/hadoop-2.0.0-alpha-src/hadoop-dist/target/hadoop-2.0.0-alpha/dfs/data/current]'}, localName='research-virtual-machine:50010', storageID='DS-1891234353-127.0.1.1-50010-1371753278058', xmitsInProgress=0}:Exception transfering block BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013 to mirror 172.16.124.130:50010: java.net.ConnectException: Connection refused
2013-06-20 14:39:15,640 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1013 received
exception java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
.. .. ..
2013-06-20 14:39:15,663 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: research-virtual-machine:50010:DataXceiver error processing TRANSFER_BLOCK operation src: /172.16.124.129:60110 dest: /172.16.124.128:50010
java.io.IOException: BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1012 is neither a RBW nor a Finalized, r=ReplicaBeingWritten, blk_8462409925548039750_1013, RBW
getNumBytes() = 23183401
getBytesOnDisk() = 23183401
getVisibleLength()= 23183401
getVolume() = /home/ding/research/hadoop/hadoop-2.0.0-alpha-src/hadoop-dist/target/hadoop-2.0.0-alpha/dfs/data/current
getBlockFile() = /home/ding/research/hadoop/hadoop-2.0.0-alpha-src/hadoop-dist/target/hadoop-2.0.0-alpha/dfs/data/current/BP-1107653285-127.0.1.1-1371753149490/current/rbw/blk_8462409925548039750
bytesAcked=23183401
bytesOnDisk=23183401
at org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2048)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:524)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:114)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:78)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
at java.lang.Thread.run(Thread.java:662)
2013-06-20 14:38:48,609 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at research-virtual-machine/127.0.1.1
************************************************************/
--- Nothing suspicious here..
Locate the source in DN2:
/**
* Transfer a replica to the datanode targets.
* @param b the block to transfer.
* The corresponding replica must be an RBW or a Finalized.
* Its GS and numBytes will be set to
* the stored GS and the visible length.
* @param targets
* @param client
*/
void transferReplicaForPipelineRecovery(final ExtendedBlock b,
final DatanodeInfo[] targets, final String client) throws IOException {
//get replica information
synchronized(data) {
if (data.isValidRbw(b)) {
stage = BlockConstructionStage.TRANSFER_RBW;
} else if (data.isValidBlock(b)) {
stage = BlockConstructionStage.TRANSFER_FINALIZED;
} else {
final String r = data.getReplicaString(b.getBlockPoolId(), b.getBlockId());
throw new IOException(b + " is neither a RBW nor a Finalized, r=" + r);
← Thrown here: “java.io.IOException: BP-1107653285-127.0.1.1-1371753149490:blk_8462409925548039750_1012 is neither a RBW nor a Finalized, r=ReplicaBeingWritten, blk_8462409925548039750_1013, RBW”
}
.. .. ..
}
Here:
b: blk_8462409925548039750_1012
r: ReplicaBeingWritten, blk_8462409925548039750_1013, RBW
Now you can see the genstamp mismatch: 'b' comes from the client; 'r' comes from the local replica store.
FsDatasetImpl.java: isValidBlock returns true if the block is valid.
/**
* Check whether the given block is a valid one.
* valid means finalized
*/
public boolean isValidBlock(ExtendedBlock b) {
return isValid(b, ReplicaState.FINALIZED); // →FsDatasetImpl.isValid (ExtendedBlock, ReplicaState)
}
FsDatasetImpl.java:
/** Does the block exist and have the given state? */
private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(),
b.getLocalBlock()); // →ReplicaMap.get(String bpid, Block)
return replicaInfo != null
&& replicaInfo.getState() == state
&& replicaInfo.getBlockFile().exists();
}
/**
* Get the meta information of the replica that matches both block id
* and generation stamp
* @param bpid block pool id
* @param block block with its id as the key
* @return the replica's meta information
* @throws IllegalArgumentException if the input block or block pool is null
*/
ReplicaInfo get(String bpid, Block block) {
checkBlockPool(bpid);
checkBlock(block);
ReplicaInfo replicaInfo = get(bpid, block.getBlockId());
if (replicaInfo != null &&
block.getGenerationStamp() == replicaInfo.getGenerationStamp()) {
← 1012 (from client) != 1013 (local replica): this if test fails!
return replicaInfo;
}
return null;
}
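To make the failing lookup concrete, here is a minimal sketch (assumed names, not the Hadoop classes) of a replica map keyed by block id whose get() also requires the generation stamps to match, mirroring ReplicaMap.get above:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the genstamp-checked lookup: the map key is the block id,
// but get() returns null when the requested genstamp differs from the
// stored replica's -- which is what makes isValid() fail in this case.
public class ReplicaMapSketch {
    static final class Replica {
        final long blockId;
        final long genStamp;
        Replica(long blockId, long genStamp) {
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    private final Map<Long, Replica> byId = new HashMap<>();

    void put(Replica r) { byId.put(r.blockId, r); }

    Replica get(long blockId, long genStamp) {
        Replica r = byId.get(blockId);
        return (r != null && r.genStamp == genStamp) ? r : null;
    }

    public static void main(String[] args) {
        ReplicaMapSketch map = new ReplicaMapSketch();
        // DN2's local replica: the failed append already bumped it to genstamp 1013.
        map.put(new Replica(8462409925548039750L, 1013));
        // The client still holds genstamp 1012, so the lookup returns null
        // and DN2 concludes the block is "neither a RBW nor a Finalized".
        System.out.println(map.get(8462409925548039750L, 1012)); // null
        System.out.println(map.get(8462409925548039750L, 1013) != null); // true
    }
}
```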
The reason, similar to HDFS-4022, is a genstamp mismatch: the genstamp from the network (from the client) does not match the one in the local replica store!
In fact, if you take a look at the NN or DN2's log, you can see that the genstamp becomes 1013.
The root cause: the failed append bumped the genstamp to 1013 on DN1 and DN2, but the client is not aware of it. Therefore, when the network error occurs, the client asks the DNs to find another node to replicate this block but gives them the stale genstamp, causing DN2 to throw an error, which in turn causes the “Premature EOF” exception in DN1.
Client uses a stale generation stamp of a block, causing the DN to refuse to replicate it.
Incorrect exception handling. (DN3 is down, which happens to be the last DN in an append pipeline).
The fix is simple: in DN2's code, update the genstamp before verifying the validity of the block:
Index: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
===================================================================
--- hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java (revision 1341741)
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java (working copy)
@@ -2028,6 +2028,18 @@
//get replica information
synchronized(data) {
+ Block storedBlock = data.getStoredBlock(b.getBlockPoolId(),
+ b.getBlockId());
+ if (null == storedBlock) {
+ throw new IOException(b + " not found in datanode.");
+ }
+ storedGS = storedBlock.getGenerationStamp();
+ if (storedGS < b.getGenerationStamp()) {
+ throw new IOException(storedGS
+ + " = storedGS < b.getGenerationStamp(), b=" + b);
+ }
+ // Update the genstamp with storedGS
+ b.setGenerationStamp(storedGS);
if (data.isValidRbw(b)) {
stage = BlockConstructionStage.TRANSFER_RBW;
} else if (data.isValidBlock(b)) {
@@ -2036,18 +2048,9 @@
final String r = data.getReplicaString(b.getBlockPoolId(), b.getBlockId());
throw new IOException(b + " is neither a RBW nor a Finalized, r=" + r);
}
-
- storedGS = data.getStoredBlock(b.getBlockPoolId(),
- b.getBlockId()).getGenerationStamp();
- if (storedGS < b.getGenerationStamp()) {
- throw new IOException(
- storedGS + " = storedGS < b.getGenerationStamp(), b=" + b);
- }
visible = data.getReplicaVisibleLength(b);
}
-
- //set storedGS and visible length
- b.setGenerationStamp(storedGS);
+ //set visible length
b.setNumBytes(visible);