SSTableScanner could skip rows when vnodes are enabled. SSTableScanner is used by SSTableReader, which in turn is used by any task that reads an SSTable, including user reads.
To a user, the skipped rows cannot be read, so the failure effectively looks like data loss.
Single file. The failure affected the core of Cassandra, so data loss would be expected. The severity of the data loss depends on how many vnodes and nodes there are: the more vnodes and nodes, the higher the percentage of data loss.
Cassandra must be configured to use vnodes.
There are many ways to trigger it: any operation that invokes the SSTableScanner will do.
1. Insert some data.
2. Force a flush (so that the data is written to an on-disk SSTable).
3. Read the data.
The read then fails to return the data.
SSTableScanner does not support vnodes properly. Keys in the ring space are skipped under certain conditions.
When looking through the SSTableScanner code, we noticed that the developer had made a wrong assumption. Because of the way vnodes are designed, reaching the real beginning of the SSTable requires calling ifile.seek(0) and dfile.seek(0).
Yes, but only with a test case. It is hard to test in a general way.
The fix corrects the developer's wrong assumption. Checking indexPosition == -1 is not enough: ifile.seek(0) and dfile.seek(0) must also be performed in order to guarantee that we are at the beginning of the SSTable.
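
The effect of the missing seek can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Cassandra's code: ToyScanner, index_position, scan_keys and the reset_on_miss flag are hypothetical names, a sorted list stands in for the SSTable, a single integer stands in for the ifile/dfile pointers, and the order of the ranges is chosen just to expose the problem. A return value of -1 models indexPosition == -1, that is, a range that starts before the first key.

```python
from bisect import bisect_right


class ToyScanner:
    """Toy stand-in for a scanner that reuses one file pointer across ranges."""

    def __init__(self, keys, reset_on_miss):
        self.keys = sorted(keys)            # stands in for the rows on disk
        self.pos = 0                        # stands in for the ifile/dfile pointers
        self.reset_on_miss = reset_on_miss  # True models the fixed behaviour

    def index_position(self, left):
        # -1 models indexPosition == -1: the range starts before the first key.
        if not self.keys or left < self.keys[0]:
            return -1
        # Otherwise return the position of the first key greater than `left`.
        return bisect_right(self.keys, left)

    def scan(self, ranges):
        found = []
        for left, right in ranges:          # each range is (left, right]
            index_pos = self.index_position(left)
            if index_pos == -1:
                if self.reset_on_miss:
                    # Analogue of ifile.seek(0) and dfile.seek(0).
                    self.pos = 0
                # Buggy variant: assume the pointer is already at the start
                # and leave it wherever the previous range left it.
            else:
                self.pos = index_pos
            # Read forward until the end of the range or the end of the data.
            while self.pos < len(self.keys) and self.keys[self.pos] <= right:
                found.append(self.keys[self.pos])
                self.pos += 1
        return found


def scan_keys(keys, ranges, reset_on_miss):
    """Scan `keys` over `ranges` with a fresh ToyScanner."""
    return ToyScanner(keys, reset_on_miss).scan(ranges)


if __name__ == "__main__":
    keys = [10, 20, 30, 40]
    ranges = [(25, 45), (0, 15)]            # second range starts before key 10
    print("without seek:", scan_keys(keys, ranges, reset_on_miss=False))
    print("with seek:   ", scan_keys(keys, ranges, reset_on_miss=True))
```

In this model the second range starts before the first key, so the lookup returns -1. Without the explicit reset the pointer is still at the end of the data from the previous range and key 10 is never returned. With the reset it is found.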