HBase-2545 Report
Unresponsive region server, potential deadlock
Region server infinite loop (hang) on client’s Get or Scan requests.
Blocker
No
No
Clients issue GET or SCAN requests on certain tables.
0.20.4
Standard
The test case “TestExplicitColumnTracker” contains the dataset needed to construct a table that triggers this infinite loop.
Single event
Yes
Hard -- This will be an interesting case to test our failure repro tool.
1. (RS + client)
Users took a jstack dump:
"IPC Server handler 10 on 60020" daemon prio=10 tid=0x00002aacb6844000 nid=0xcc2 runnable [0x0000000042f56000]
java.lang.Thread.State: RUNNABLE
at org.apache.hadoop.hbase.regionserver.ExplicitColumnTracker.checkColumn(ExplicitColumnTracker.java:128)
at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:165)
at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:176)
- locked <0x00002aaacd690ef0> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:106)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.nextInternal(HRegion.java:1923)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1887)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2507)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2493)
at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1742)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
Many threads are stuck here.
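One way to confirm that many handler threads share the same top frame is to tally the first "at ..." line of each thread in the dump. The sketch below is only an illustration: the class name TopFrameCounter and the shortened sample dump are hypothetical, and it assumes the usual jstack layout in which each thread begins with a quoted name line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of HBase or the JDK): tallies the top frame of each thread in jstack text. */
class TopFrameCounter {

    /** Returns top-frame text mapped to the number of threads whose stack starts with that frame. */
    static Map<String, Integer> countTopFrames(String dump) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        boolean awaitingTopFrame = false;
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // A quoted thread name marks the start of a new thread section.
                awaitingTopFrame = true;
            } else if (awaitingTopFrame && line.startsWith("at ")) {
                // The first "at" line after the header is the frame currently executing.
                counts.merge(line.substring(3), 1, Integer::sum);
                awaitingTopFrame = false;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Shortened, made-up dump in the same shape as the report's jstack output.
        String dump =
            "\"IPC Server handler 10 on 60020\" daemon prio=10 runnable\n"
          + "   java.lang.Thread.State: RUNNABLE\n"
          + "\tat org.apache.hadoop.hbase.regionserver.ExplicitColumnTracker.checkColumn(ExplicitColumnTracker.java:128)\n"
          + "\tat org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:165)\n"
          + "\n"
          + "\"IPC Server handler 11 on 60020\" daemon prio=10 runnable\n"
          + "   java.lang.Thread.State: RUNNABLE\n"
          + "\tat org.apache.hadoop.hbase.regionserver.ExplicitColumnTracker.checkColumn(ExplicitColumnTracker.java:128)\n";
        countTopFrames(dump).forEach((frame, n) -> System.out.println(n + "  " + frame));
    }
}
```

A frame that is RUNNABLE and dominates the tally across repeated dumps points to a busy loop rather than a lock wait.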
The jstack output was critical: it led developers straight to the offending code (the +/- lines show the fix):
  public MatchCode checkColumn(byte [] bytes, int offset, int length) {
+   boolean recursive;
    do {
+     recursive = false;
      // No more columns left, we are done with this query
      if(this.columns.size() == 0) {
        return MatchCode.DONE; // done_row
      }
      // No more columns to match against, done with storefile
      if(this.column == null) {
        return MatchCode.NEXT; // done_row
      }
      // Compare specific column to current column
      int ret = Bytes.compareTo(column.getBuffer(), column.getOffset(),
          column.getLength(), bytes, offset, length);
      // Matches, decrement versions left and include
      if(ret == 0) {
        if(this.column.decrement() == 0) {
          // Done with versions for this column
          this.columns.remove(this.index);
          if(this.columns.size() == this.index) {
            // Will not hit any more columns in this storefile
            this.column = null;
          } else {
            this.column = this.columns.get(this.index);
          }
        }
        return MatchCode.INCLUDE;
      }
      // Specified column is bigger than current column
      // Move down current column and check again
      if(ret <= -1) {
        if(++this.index == this.columns.size()) {
          // No more to match, do not include, done with storefile
          return MatchCode.NEXT; // done_row
        }
        this.column = this.columns.get(this.index);
        recursive = true;
        continue;
      }
-   } while(true);
+   } while(recursive);
+   return MatchCode.SKIP; // skip to next column, with hint?
  }
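--- Infinite loop on certain data: when the tracked column sorts after the column being checked (ret > 0), no branch returns or advances the index, so the original do { ... } while(true) never exits.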
Infinite loop on certain format of data.
Infinite loop
Break out of the loop: iterate again only when the recursive flag is set; otherwise return MatchCode.SKIP.
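The loop and its fix can be modelled in a few lines. This is a simplified sketch, not HBase code: the class ColumnLoopSketch, the use of String column names, and the maxSpins cap are all assumptions made for illustration, and version counting (decrement) is left out. The cap exists only so that the unpatched variant terminates in a demo instead of hanging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of the checkColumn loop (no version counting). */
class ColumnLoopSketch {
    enum MatchCode { INCLUDE, SKIP, NEXT, DONE }

    private final List<String> columns; // requested columns, assumed sorted
    private int index = 0;

    ColumnLoopSketch(List<String> columns) {
        this.columns = new ArrayList<>(columns);
    }

    /**
     * patched=true mimics "while(recursive)"; patched=false mimics "while(true)".
     * Returns null if the loop spins more than maxSpins times (demo-only guard).
     */
    MatchCode checkColumn(String current, boolean patched, int maxSpins) {
        boolean recursive;
        int spins = 0;
        do {
            if (++spins > maxSpins) {
                return null; // would be an infinite loop without this guard
            }
            recursive = false;
            if (columns.isEmpty()) {
                return MatchCode.DONE;
            }
            if (index >= columns.size()) {
                return MatchCode.NEXT;
            }
            int ret = columns.get(index).compareTo(current);
            if (ret == 0) {
                return MatchCode.INCLUDE;
            }
            if (ret < 0) {
                // Tracked column sorts before the current one: advance and re-check.
                if (++index == columns.size()) {
                    return MatchCode.NEXT;
                }
                recursive = true;
            }
            // ret > 0: nothing above returned or changed state.
        } while (recursive || !patched);
        return MatchCode.SKIP;
    }

    public static void main(String[] args) {
        ColumnLoopSketch fixed = new ColumnLoopSketch(Arrays.asList("b", "d"));
        System.out.println("patched, column a: " + fixed.checkColumn("a", true, 1000));

        ColumnLoopSketch broken = new ColumnLoopSketch(Arrays.asList("b"));
        boolean hitCap = broken.checkColumn("a", false, 1000) == null;
        System.out.println("unpatched hit spin cap: " + hitCap);
    }
}
```

In this model the unpatched variant spins only when the requested column sorts after the column being checked, which matches the report's observation that the hang depends on the data.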