Multiget_slice: Takes a list of keys and fetches the matching rows from the Cassandra database. It is usually used to fetch a large number of specific rows by row key.
Similar SQL command: SELECT * FROM table WHERE Primary_Key IN (2, 32, 76, 1000, 2427)
The user is having a problem with the value returned by the multiget_slice command on a super column family. After a value in the super column family is updated, reading it back returns the old value from before the update.
no, Cassandra returns a wrong (outdated) result without any error or warning
standard configuration with 1 node
1) Create one or more super column entries (file write)
2) Use nodetool to flush the column family (feature start)
3) Update the sub column values (file write)
4) Read the updated value (file read)
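The four steps above can be sketched with a toy in-memory model. This is only an illustration of the expected behavior. The class and method names (FlushReadSketch, write, flush, read) are hypothetical and are not Cassandra APIs. A read that consults every source and keeps the newest timestamp should return the updated value after the flush.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not Cassandra classes.
class FlushReadSketch {
    // A cell is a value plus the write timestamp used for reconciliation.
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    private Map<String, Cell> memtable = new HashMap<>();
    private final List<Map<String, Cell>> sstables = new ArrayList<>();

    // Steps 1 and 3: writes always land in the current memtable.
    void write(String subColumn, String value, long timestamp) {
        memtable.put(subColumn, new Cell(value, timestamp));
    }

    // Step 2: freeze the memtable as an "sstable" and start a fresh memtable.
    void flush() {
        sstables.add(memtable);
        memtable = new HashMap<>();
    }

    // Step 4: consult every source and keep the cell with the newest timestamp.
    String read(String subColumn) {
        Cell best = memtable.get(subColumn);
        for (Map<String, Cell> sstable : sstables) {
            Cell c = sstable.get(subColumn);
            if (c != null && (best == null || c.timestamp > best.timestamp)) best = c;
        }
        return best == null ? null : best.value;
    }

    static String demo() {
        FlushReadSketch store = new FlushReadSketch();
        store.write("sub1", "old", 1L); // step 1
        store.flush();                  // step 2
        store.write("sub1", "new", 2L); // step 3
        return store.read("sub1");      // step 4
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "new"
    }
}
```

In the reported failure, step 4 behaved as if only the flushed data had been consulted, so the read returned the pre-update value.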
no, there are no error messages or warnings, just a wrong result.
The user is having a problem with doing a multiget_slice on a super column
family after its first flush. Updates to the column values work properly, but
a multiget_slice operation fails to retrieve the updated values. Instead it
returns the values from before the flush. The problem does not appear with
standard column families.
There is no obvious backward inference for this failure. It was diagnosed only after a developer, drawing on domain knowledge, pointed to a similar problem. That problem involved collating data (merging data to form the result) from Memtables and SSTables, and it appeared only when the query involved SuperColumns. The developer did not have a proper fix, and the proposed change was not tested for regressions or concurrency, but the lead gave other developers a direction. Two more problems related to this failure were then found. The first is that the name-based path in CollationController stops as soon as it finds one subcolumn in a given supercolumn. The second, which can hide the first, is that SuperColumn.minTimestamp is calculated incorrectly. SuperColumn.minTimestamp is used to "short-circuit" a SuperColumn read as an optimization: it bounds the iterative search for subcolumns. That is a problem here because, without an exhaustive search, we do not know how many subcolumns a super column may contain.
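The short-circuit idea can be sketched as follows. This is a minimal illustration under stated assumptions, not Cassandra's code. The helper names (minTimestamp, canSkip) and the array-of-timestamps representation are hypothetical. The assumed rule is that an older SSTable may be skipped only when every subcolumn collected so far is newer than anything that SSTable can contain.

```java
// Toy model only: hypothetical helpers, not Cassandra's SuperColumn.
class MinTimestampSketch {
    // Smallest timestamp among the subcolumns collected so far.
    static long minTimestamp(long[] subColumnTimestamps) {
        long min = Long.MAX_VALUE;
        for (long ts : subColumnTimestamps) min = Math.min(min, ts);
        return min;
    }

    // Assumed skip rule: skip the sstable only if all collected subcolumns are newer than it.
    static boolean canSkip(long[] collectedTimestamps, long sstableMaxTimestamp) {
        return collectedTimestamps.length > 0
            && minTimestamp(collectedTimestamps) > sstableMaxTimestamp;
    }

    public static void main(String[] args) {
        long[] collected = {5L, 12L};
        // One collected subcolumn (timestamp 5) is older than the sstable, so it must still be read.
        System.out.println(canSkip(collected, 10L));             // false
        // If every collected subcolumn is newer, the sstable cannot override any of them.
        System.out.println(canSkip(new long[] {11L, 12L}, 10L)); // true
    }
}
```

If the minimum were reported too high (for instance 12 instead of 5 in the first case), the SSTable would be skipped even though it may hold a newer value for one subcolumn. Even a correct minimum covers only the subcolumns already collected, and says nothing about subcolumns that exist only in the older SSTable.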
no, the logs do not give us useful information
yes. This failure is triggered by 3 problems in Cassandra.
Two bugs occur during collation, and SuperColumn.minTimestamp prematurely terminates the iterative search for subcolumns.
Very hard. Depends on domain knowledge.
There are 3 root causes to this failure.
1) When collating data from Memtables and SSTables, the developers used a TreeMap-backed structure (TreeMapBackedSortedColumns). Its merge logic was not written correctly for super columns.
2) When collating data from Memtables and SSTables, the name-based resolution path stops as soon as it finds one subcolumn in a given super column.
3) The SuperColumn.minTimestamp value prematurely terminated the iterative search for subcolumns.
Summary: all three root causes affect the collation of super column data.
Fix was simple:
1) TreeMap algorithm fix: fix the problem where a column in TreeMapBackedSortedColumns cannot be resolved
* If we find an old column that has the same name
* then ask it to resolve itself, else add the new column
public void addColumn(IColumn column, Allocator allocator)
ByteBuffer name = column.name();
IColumn oldColumn = put(name, column);
if (oldColumn != null)
if (oldColumn instanceof SuperColumn)
assert column instanceof SuperColumn;
+ // since oldColumn is where we've been accumulating results, it's usually going to be faster to
+ // add the new one to the old, then place old back in the Map, rather than copy the old contents
+ // into the new Map entry.
((SuperColumn) oldColumn).putColumn((SuperColumn)column, allocator);
+ put(name, oldColumn);
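A self-contained toy version of this fix is sketched below. The types are hypothetical stand-ins (SuperCol, Cell) rather than Cassandra's classes, and the collation order (newer memtable data first, then older SSTable data) is an assumption of the sketch. It shows why the extra put matters: Map.put replaces the entry with the new column, while the merged result lives in the old column.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical stand-ins, not Cassandra's classes.
class TreeMapMergeSketch {
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    static final class SuperCol {
        final Map<String, Cell> subColumns = new TreeMap<>();

        SuperCol with(String name, String value, long timestamp) {
            subColumns.put(name, new Cell(value, timestamp));
            return this;
        }

        // Merge another super column into this one; the newest timestamp wins per subcolumn.
        void putAll(SuperCol other) {
            for (Map.Entry<String, Cell> e : other.subColumns.entrySet()) {
                Cell mine = subColumns.get(e.getKey());
                if (mine == null || e.getValue().timestamp > mine.timestamp)
                    subColumns.put(e.getKey(), e.getValue());
            }
        }
    }

    // reinsert=false mimics the pre-fix behavior; reinsert=true mimics the patched behavior.
    static void addColumn(TreeMap<String, SuperCol> map, String name, SuperCol column, boolean reinsert) {
        SuperCol old = map.put(name, column);  // the map now points at the NEW column
        if (old != null) {
            old.putAll(column);                // the merged result lives in the OLD column
            if (reinsert) map.put(name, old);  // the fix: place the merged column back
        }
    }

    static String read(boolean reinsert) {
        TreeMap<String, SuperCol> result = new TreeMap<>();
        // Assumed order: memtable data (newer) is collated first, then sstable data (older).
        addColumn(result, "sc", new SuperCol().with("sub", "new", 2L), reinsert);
        addColumn(result, "sc", new SuperCol().with("sub", "old", 1L), reinsert);
        return result.get("sc").subColumns.get("sub").value;
    }

    public static void main(String[] args) {
        System.out.println("without re-insert: " + read(false)); // old
        System.out.println("with re-insert: " + read(true));     // new
    }
}
```

Without the re-insert, the map keeps whichever column was added last, which in this sketch is the stale SSTable copy. That matches the reported symptom of reading the pre-flush value.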
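2) path resolution fix: the name-based resolution path stopped as soon as it found one subcolumn in a given super column. The name-based path is now used only for standard column families, or when a super column name is specified.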
public ColumnFamily getTopLevelColumns()
- return filter.filter instanceof NamesQueryFilter && cfs.metadata.getDefaultValidator() != CounterColumnType.instance
+ return filter.filter instanceof NamesQueryFilter
+ && (cfs.metadata.cfType == ColumnFamilyType.Standard || filter.path.superColumnName != null)
+ && cfs.metadata.getDefaultValidator() != CounterColumnType.instance
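The effect of stopping early can be sketched with a toy collation loop. This is an illustration only: NamePathSketch, readSuperColumn, and the stopAtFirstHit flag are hypothetical and do not mirror CollationController's real structure. Sources are assumed to be visited newest first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: hypothetical names, not Cassandra's CollationController.
class NamePathSketch {
    // Each source maps super column name -> (subcolumn name -> value).
    static Map<String, String> readSuperColumn(List<Map<String, Map<String, String>>> sourcesNewestFirst,
                                               String superName, boolean stopAtFirstHit) {
        Map<String, String> merged = new TreeMap<>();
        for (Map<String, Map<String, String>> source : sourcesNewestFirst) {
            Map<String, String> found = source.get(superName);
            if (found == null) continue;
            // Newer sources were visited first, so keep values that are already present.
            for (Map.Entry<String, String> e : found.entrySet())
                merged.putIfAbsent(e.getKey(), e.getValue());
            if (stopAtFirstHit) break; // mimics the pre-fix shortcut
        }
        return merged;
    }

    static Map<String, String> demo(boolean stopAtFirstHit) {
        Map<String, String> inMemtable = new HashMap<>();
        inMemtable.put("a", "new");   // only subcolumn "a" was updated after the flush
        Map<String, String> inSstable = new HashMap<>();
        inSstable.put("a", "old");
        inSstable.put("b", "b1");     // subcolumn "b" exists only on disk

        Map<String, Map<String, String>> memtable = new HashMap<>();
        memtable.put("sc", inMemtable);
        Map<String, Map<String, String>> sstable = new HashMap<>();
        sstable.put("sc", inSstable);

        List<Map<String, Map<String, String>>> sources = new ArrayList<>();
        sources.add(memtable);
        sources.add(sstable);
        return readSuperColumn(sources, "sc", stopAtFirstHit);
    }

    public static void main(String[] args) {
        System.out.println(demo(true));  // {a=new}
        System.out.println(demo(false)); // {a=new, b=b1}
    }
}
```

When the query names a whole super column, finding it in one source does not mean all of its subcolumns have been seen, so the shortcut drops subcolumn "b" in this sketch.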
3) SuperColumn.minTimestamp fix: the SuperColumn.minTimestamp value prematurely terminated the iterative search for subcolumns.
ByteBuffer filterColumn = iterator.next();
IColumn column = container.getColumn(filterColumn);
- if (column != null && column.minTimestamp() > sstableTimestamp)
+ if (column != null && column.timestamp() > sstableTimestamp)
6. Scope of the failure
single super column / single file (containing super column file structure)