CASSANDRA-3288

1. Symptom

When trying to add a column family through the Hector client, the operation fails with me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra encountered an internal error processing this request. The client simply terminates.

Category (in the spreadsheet):

early termination

1.1 Severity

Critical

1.2 Was there an exception thrown? (Exception column in the spreadsheet)

Yes. me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra encountered an internal error processing this request.

 

1.2.1 Were there multiple exceptions?

no

1.3 Was there a long propagation of the failure?

no

1.4 Scope of the failure (e.g., single client, all clients, single file, entire fs, etc.)

single client

Catastrophic? (spreadsheet column)

no

2. How to reproduce this failure

2.0 Version

1.0.0

2.1 Configuration

Standard configuration. A column family must first be created through the CLI, and this first column family should have an ID of 0.

 

# of Nodes?

1

2.2 Reproduction procedure

1. Create a column family with ID=0 (file write)

2. Use Hector’s system_add_column_family API to add a column family to Cassandra (feature start)

 

Num triggering events

2

 

2.2.1 Timing order (Order important column)

yes

2.2.2 Events order externally controllable? (Order externally controllable? column)

yes

2.3 Can the logs tell how to reproduce the failure?

yes

2.4 How many machines needed?

1

 

3. Diagnosis procedure

Error msg?

yes

3.1 Detailed Symptom (where you start)

When trying to add a column family through the Hector client, the operation fails with the message “Cassandra encountered an internal error processing this request”.

3.2 Backward inference

After seeing the Hector client error message while trying to add a column family, the first step of diagnosis is to look at the data passed to Cassandra. A column family definition (the schema of the column family) is passed to Cassandra just before the error, and the column family ID inside that definition is 0. However, Cassandra already contains an entry with a column family ID of 0. Because of this duplicated ID, Cassandra prevents the “add column family” operation from completing successfully. Further analysis shows that the column family ID in the client-supplied definition is just a placeholder and serves no purpose, so it can safely be ignored on the Cassandra side.

3.3 Are the printed logs sufficient for diagnosis?

yes

 

4. Root cause

A redundant ID field in the column family definition conflicts with existing data inside Cassandra. Cassandra did not ignore the redundant field, so when handling the add column family operation, the passed-in ID conflicted with the ID of an existing column family.
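
The conflict can be illustrated with a small, self-contained Java sketch. This is a simplified model and not Cassandra's actual schema code: the class CfIdConflictSketch, the method addWithClientId, the exception type, and the column family names are all hypothetical and exist only to show how trusting a client-supplied placeholder ID collides with an existing entry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a server-side schema registry.
// It is NOT Cassandra's real implementation; names are illustrative only.
public class CfIdConflictSketch {
    // Maps column family ID -> column family name.
    private final Map<Integer, String> idToName = new HashMap<>();

    // Registers a column family under the ID supplied by the caller and
    // rejects the request if that ID is already in use.
    public void addWithClientId(int clientId, String name) {
        if (idToName.containsKey(clientId)) {
            throw new IllegalStateException(
                "ID " + clientId + " already used by " + idToName.get(clientId));
        }
        idToName.put(clientId, name);
    }

    public static void main(String[] args) {
        CfIdConflictSketch schema = new CfIdConflictSketch();

        // Step 1 of the reproduction: a column family with ID 0 already exists.
        schema.addWithClientId(0, "cf_from_cli");

        // Step 2: the client sends a definition whose placeholder ID is also 0.
        try {
            schema.addWithClientId(0, "cf_from_client");
        } catch (IllegalStateException e) {
            System.out.println("add failed: " + e.getMessage());
        }
    }
}
```

In this model the second call fails only because the server honours an ID that the client never meant to be meaningful.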

4.1 Category:

semantic

4.2 Are there multiple faults?

no

4.3 Can we automatically test it?

yes

5. Fix

5.1 How?

Ignore the column family ID on the server side when it is passed in through the column family definition.

+ cf_def.unsetId(); // explicitly ignore any id set by client (Hector likes to set zero)
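
The effect of the one-line fix can be sketched in isolation. The example below is an assumption-laden model, not the patched Cassandra code: CfDefSketch is a hypothetical stand-in for a Thrift-style struct with an optional integer field (mirroring the setId/unsetId/isSetId accessor pattern), and UnsetIdSketch.addColumnFamily is a hypothetical server-side handler that discards the client's ID and assigns its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "ignore the client-supplied ID on the server side".
public class UnsetIdSketch {
    // Minimal stand-in for a Thrift-style struct with an optional int field.
    static final class CfDefSketch {
        final String name;
        private int id;
        private boolean idSet;

        CfDefSketch(String name) { this.name = name; }

        void setId(int id) { this.id = id; this.idSet = true; }
        void unsetId() { this.idSet = false; }
        boolean isSetId() { return idSet; }
        int getId() { return id; }
    }

    // Maps server-assigned column family ID -> column family name.
    private final Map<Integer, String> idToName = new HashMap<>();
    private int nextId = 0;

    // Adds a column family, explicitly ignoring any ID set by the client,
    // and returns the ID chosen by the server.
    int addColumnFamily(CfDefSketch cfDef) {
        cfDef.unsetId();              // same idea as the fix: drop the client's ID
        int assigned = nextId++;      // the server picks the next free ID itself
        idToName.put(assigned, cfDef.name);
        return assigned;
    }

    public static void main(String[] args) {
        UnsetIdSketch server = new UnsetIdSketch();

        CfDefSketch first = new CfDefSketch("cf_from_cli");
        first.setId(0);
        CfDefSketch second = new CfDefSketch("cf_from_client");
        second.setId(0);              // the client again sends the placeholder 0

        int a = server.addColumnFamily(first);
        int b = server.addColumnFamily(second);
        System.out.println("assigned IDs: " + a + " and " + b);
    }
}
```

Because the handler never reads the client's ID, two definitions that both carry the placeholder value no longer collide in this model.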