Background: Streaming is primarily used by Cassandra itself to transfer data from one node to another. In this case, the failure occurs in the function where Cassandra transfers the matching portion of an SSTable (the differential data) from one node to another. If the SSTable is identical on both nodes, no data needs to be transferred.
Failure: The transferSSTables function in StreamOut has no special handling for an empty stream (i.e. the case where the SSTables on the two nodes are the same).
User perception: Since this function is not directly visible to the user, the failure can only be observed indirectly. The user will have problems moving or restoring nodes in a multi-node system. Because both nodes would have old data on disk, there is a chance that Cassandra will try to modify an SSTable that is duplicated between the two nodes. In this case, the failure stated above is triggered.
0.7 beta 3
1. Start empty stream (feature start)
2. transferSSTables (feature start)
StreamOut only starts a stream if there are actually files to transfer. This means callbacks will never get called for streams that don't actually have anything to transfer.
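The behaviour described above can be illustrated with a small, self-contained sketch. It does not use Cassandra's classes: Session, transferBuggy, and transferGuarded are hypothetical names invented for this example, and closing the session on an empty transfer is only one possible guard, not necessarily the fix that was applied upstream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-ins for illustration only; these are NOT Cassandra's classes.
class EmptyStreamSketch {

    /** Minimal session that runs a completion callback when it is closed. */
    static class Session {
        private final Runnable onComplete;
        private boolean started = false;
        private boolean closed = false;

        Session(Runnable onComplete) {
            this.onComplete = onComplete;
        }

        /** Marks the stream as started; actually sending the files is omitted here. */
        void begin(List<String> pendingFiles) {
            started = true;
        }

        /** Closes the session and runs the callback exactly once. */
        void close() {
            if (closed) {
                return;
            }
            closed = true;
            onComplete.run();
        }

        boolean isStarted() { return started; }
        boolean isClosed() { return closed; }
    }

    /** Mirrors the reported behaviour: with nothing to send, nothing happens at all. */
    static void transferBuggy(Session session, List<String> pendingFiles) {
        if (pendingFiles.size() > 0) {
            session.begin(pendingFiles);
        }
        // Empty list: the session is neither started nor closed, so the callback never runs.
    }

    /** Assumed guard: treat an empty transfer as immediately complete. */
    static void transferGuarded(Session session, List<String> pendingFiles) {
        if (pendingFiles.size() > 0) {
            session.begin(pendingFiles);
        } else {
            session.close();
        }
    }

    public static void main(String[] args) {
        List<String> nothingToSend = new ArrayList<>();

        AtomicBoolean buggyFired = new AtomicBoolean(false);
        transferBuggy(new Session(() -> buggyFired.set(true)), nothingToSend);
        System.out.println("buggy path, callback fired: " + buggyFired.get());

        AtomicBoolean guardedFired = new AtomicBoolean(false);
        transferGuarded(new Session(() -> guardedFired.set(true)), nothingToSend);
        System.out.println("guarded path, callback fired: " + guardedFired.get());
    }
}
```

In this sketch, a caller waiting on the callback would wait forever on the unguarded path when the file list is empty, which matches the symptom of a move or restore operation that never completes.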
The empty stream is a very obvious clue to the failure. Checking this assumption easily identifies the problem.
There is no code to handle an empty stream.
public static void transferSSTables(StreamOutSession session, Collection<SSTableReader> sstables, Collection<Range> ranges) throws IOException
List<PendingFile> pending = createPendingFiles(sstables, ranges);
if (pending.size() > 0)