CASSANDRA-1646

1. Symptom

Using consistency level QUORUM with a replication factor of 1 in a unit test causes a timeout exception.

 

Category (in the spreadsheet):

hang

 

1.1 Severity

Critical

1.2 Was there exception thrown? (Exception column in the spreadsheet)

yes, timeout exception

 

1.2.1 Were there multiple exceptions?

no

 

1.3 Was there a long propagation of the failure?

no

 

1.4 Scope of the failure (e.g., single client, all clients, single file, entire fs, etc.)

all clients running the same test

 

Catastrophic? (spreadsheet column)

no

 

2. How to reproduce this failure

2.0 Version

0.7 beta 3

2.1 Configuration

standard configuration with replication factor = 1 and consistency = QUORUM
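
For reference, a majority quorum over RF replicas is floor(RF / 2) + 1, so with replication factor = 1 a QUORUM request needs exactly one replica to respond. The sketch below only illustrates that arithmetic; the class and method names (QuorumSketch, quorumFor) are hypothetical and are not Cassandra APIs.

```java
// Illustrative sketch only. QuorumSketch and quorumFor are hypothetical
// names used for this report, not part of Cassandra.
final class QuorumSketch {
    // Majority quorum: floor(RF / 2) + 1 replicas must respond.
    static int quorumFor(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1; // integer division floors for positive values
    }

    public static void main(String[] args) {
        // Print the quorum size for a few small replication factors.
        for (int rf = 1; rf <= 5; rf++) {
            System.out.println("RF=" + rf + " -> quorum=" + quorumFor(rf));
        }
    }
}
```

With RF = 1 the quorum size is 1, so the single node in this setup is sufficient to satisfy QUORUM; the configuration by itself does not explain the timeout.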

 

# of Nodes?

1

2.2 Reproduction procedure

1. Start the testMutation test case (feature start)

 

Num triggering events

1

2.2.1 Timing order (Order important column)

NA

2.2.2 Events order externally controllable? (Order externally controllable? column)

yes

2.3 Can the logs tell how to reproduce the failure?

yes

2.4 How many machines needed?

1

3. Diagnosis procedure

Error msg?

yes, timeout error

3.1 Detailed Symptom (where you start)

Running the testMutation test case with QUORUM and a replication factor of 1 causes a timeout error.

3.2 Backward inference

We see that the test reads a non-existent column. There was no error handling code for the case where no columns are found.

 

4. Root cause

missing error handling for the case where no column is found

4.1 Category:

semantic

4.2 Are there multiple faults?

no

4.3 Can we automatically test it?

yes

5. Fix

5.1 How?

Added error handling code for the case where the requested column does not exist.
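
The report does not describe the actual code path, so the following is only a generic sketch of how a missing "not found" branch can surface as a timeout. It assumes a reply that is completed only when a column exists. All names (MissingColumnSketch, readWithoutHandling, readWithHandling, timesOut) are hypothetical and are not taken from the Cassandra source or the patch.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; none of these names are Cassandra APIs.
final class MissingColumnSketch {
    // A single in-memory "row" with one existing column.
    private static final Map<String, String> ROW = Map.of("col1", "value1");

    // Variant without handling: the reply is completed only when the column
    // exists, so a lookup of a missing column never answers the caller.
    static CompletableFuture<Optional<String>> readWithoutHandling(String column) {
        CompletableFuture<Optional<String>> reply = new CompletableFuture<>();
        String value = ROW.get(column);
        if (value != null) {
            reply.complete(Optional.of(value));
        }
        return reply;
    }

    // Variant with handling: the "not found" case is answered explicitly
    // with an empty result.
    static CompletableFuture<Optional<String>> readWithHandling(String column) {
        return CompletableFuture.completedFuture(Optional.ofNullable(ROW.get(column)));
    }

    // Returns true if the caller gave up waiting for the reply.
    static boolean timesOut(CompletableFuture<Optional<String>> reply) {
        try {
            reply.get(100, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without handling, timed out: "
                + timesOut(readWithoutHandling("missing")));
        System.out.println("with handling, timed out: "
                + timesOut(readWithHandling("missing")));
    }
}
```

In this sketch the caller sees a timeout only in the variant that never answers the missing-column case, which mirrors the symptom in section 1 under the stated assumption.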