HBase-2399 Report
When a user forces a split on a region, HBase takes the split point from the first Store it finds instead of from the largest Store.
As a consequence, the split causes unbalanced load and performance degradation.
Critical
No.
No
Affects the access performance of any split region
0.92.0 (reverse the patch)
mvn test -Dtest=TestAdmin#testForceSplitMultiFamily
Standard
Client forces a split on a region
Single event
Yes
Yes.
2013-08-13 10:34:04,132 DEBUG [PRI IPC Server handler 6 on 37860] regionserver.CompactSplitThread(167): Split requested for testForceSplit,,1376404426884.f75e89f4a36dacab22598e275f225ef6.. compaction_queue=(0:0), split_queue=0
2013-08-13 10:34:04,133 INFO [RegionServer:1;localhost,37860,1376404417946-splits-1376404444132] regionserver.SplitTransaction(216): Starting split of region testForceSplit,,1376404426884.f75e89f4a36dacab22598e275f225ef6.
1. RS + client
Users will notice that the split did not have the intended effect. The relevant log entries are then easy to locate:
2013-08-13 10:34:04,132 DEBUG [PRI IPC Server handler 6 on 37860] regionserver.CompactSplitThread(167): Split requested for testForceSplit,,1376404426884.f75e89f4a36dacab22598e275f225ef6.. compaction_queue=(0:0), split_queue=0
2013-08-13 10:34:04,133 INFO [RegionServer:1;localhost,37860,1376404417946-splits-1376404444132] regionserver.SplitTransaction(216): Starting split of region testForceSplit,,1376404426884.f75e89f4a36dacab22598e275f225ef6.
The logs above indicate that the split request was at least received, but the split point was not taken from the correct (largest) Store.
A developer can then easily locate the code that picks the split point:
/**
 * @return the key at which the region should be split, or null
 * if it cannot be split. This will only be called if shouldSplit
 * previously returned true.
 */
byte[] getSplitPoint() {
  Map<byte[], Store> stores = region.getStores();
  byte[] splitPointFromLargestStore = null;
  long largestStoreSize = 0;
  for (Store s : stores.values()) {
    splitPointFromLargestStore = s.getSplitPoint();
    if (splitPointFromLargestStore != null) {
      return splitPointFromLargestStore;
    }
  }
  return null;
}
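--- The loop returns as soon as any Store yields a non-null split point, so it always picks the first Store to split; largestStoreSize is never updated.
Instead of taking the split point from the largest Store, getSplitPoint() takes it from the first Store that returns one.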
Semantic
Pick the largest Store:
+  for (Store s : stores.values()) {
+    byte[] splitPoint = s.getSplitPoint();
+    long storeSize = s.getSize();
+    if (splitPoint != null && largestStoreSize < storeSize) {
+      splitPointFromLargestStore = splitPoint;
+      largestStoreSize = storeSize;
+    }
+  }
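
The difference between the two selection strategies can be seen in a small standalone sketch. It does not use HBase classes: FakeStore, firstSplitPoint and largestStoreSplitPoint are hypothetical names introduced only for illustration, split keys are plain strings rather than byte[], and a LinkedHashMap stands in for the region's store map so that the iteration order is explicit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitPointSketch {

    /** Hypothetical stand-in for a Store: only a size and a candidate split key. */
    static final class FakeStore {
        final long size;
        final String splitPoint; // null means "this store cannot be split"

        FakeStore(long size, String splitPoint) {
            this.size = size;
            this.splitPoint = splitPoint;
        }
    }

    /** Buggy strategy: return the first non-null split point in iteration order. */
    static String firstSplitPoint(Map<String, FakeStore> stores) {
        for (FakeStore s : stores.values()) {
            if (s.splitPoint != null) {
                return s.splitPoint;
            }
        }
        return null;
    }

    /** Fixed strategy: return the split point of the largest splittable store. */
    static String largestStoreSplitPoint(Map<String, FakeStore> stores) {
        String splitPointFromLargestStore = null;
        long largestStoreSize = 0;
        for (FakeStore s : stores.values()) {
            if (s.splitPoint != null && largestStoreSize < s.size) {
                splitPointFromLargestStore = s.splitPoint;
                largestStoreSize = s.size;
            }
        }
        return splitPointFromLargestStore;
    }

    public static void main(String[] args) {
        // Two column families: a small one that iterates first and a large one.
        Map<String, FakeStore> stores = new LinkedHashMap<>();
        stores.put("cf1", new FakeStore(10, "row-005"));
        stores.put("cf2", new FakeStore(1000, "row-500"));

        System.out.println("first:   " + firstSplitPoint(stores));        // row-005
        System.out.println("largest: " + largestStoreSplitPoint(stores)); // row-500
    }
}
```

With a small store ahead of a large one in iteration order, the first strategy splits at the small store's key, while the second follows the store that holds most of the data.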