Predicting Success in Training of Navy Aviators
Neil C. Rowe and Arijit Das
Computer Science Department, U.S. Naval Postgraduate School
http://faculty.nps.edu/ncrowe
The challenge of Navy aviation training
Data used in our study
Cleaning the data
Joining the data
Respecting attribute order
Methods of predicting attribute values
Attributes measuring candidate success
After discussion with the sponsor, we identified 38 possible measures of candidate success:
Significant binary correlations with candidate success
Success-related attribute | Phase | Possible nonnull values | Nonnull occurrences | Positive correlations | Negative correlations |
NGCode | PRE | 12 strings | 1761 | 0 | 0 |
RetestStatus | ASTB | Never, 30Days, 90Days, 180Days, Never1992, Resume | 8540 | 5 | 0 |
ExamineeStatus | ASTB | None | None | 0 | 0 |
Number of ASTB1-5 | ASTB | Floating point | 14477 | 3 | 7 |
Number of ASTBE | ASTB | Floating point | 4564 | 3 | 7 |
IFS_STATUS | IFS | Complete, Disenroll, Closing | 13844 | 3 | 0 |
IFS_STATUS_NUM | IFS | Numbers 0.0 to 1.0 | 13834 | 5 | 0 |
IFS_DISENROLLMENT_DESCRIPTION | IFS | String | 5787 | 5 | 13 |
IFS_USNA_PFP | IFS | String | 634 | 0 | 17 |
IFS_ACAD_FAIL | IFS | Number 0.0 to 1.0 | 13844 | 0 | 9 |
IFS_FLT_FAIL | IFS | Number 0.0 to 1.0 | 13844 | 7 | 5 |
API_NSS | API | Integer | 17401 | 17 | 2 |
API_Test_FAILS | API | Integer | 17446 | 8 | 8 |
Count of nonnull values for the API phase | API | Floating point | 18596 | 6 | 7 |
Pri (status in training) | PRI | G, UI, NG, AT, TG, UA | 14461 | 4 | 2 |
Count of nonnull values for the PRI phase | PRI | Floating point | 18596 | 16 | 1 |
PRI academic average | PRI | Floating point | 13555 | 17 | 4 |
PRI flight average | PRI | Floating point | 10664 | 19 | 3 |
Int (status in training) | INT | G, AT, UI, NG, MA, J | 5530 | 1 | 1 |
Count of nonnull values for the INT phase | INT | Floating point | 18596 | 26 | 9 |
INT academic average | INT | Floating point | 8153 | 22 | 4 |
INT flight average | INT | Floating point | 3301 | 7 | 11 |
Count of nonnull values for the ADV phase | ADV | Floating point | 18596 | 29 | 10 |
ADV academic average | ADV | Floating point | 9712 | 21 | 4 |
Adv (status in training) | ADV | G, AT, UI, NG, UU, TG, SQ, UIT | 16005 | 1 | 1 |
ADV flight average | ADV | Floating point | 4593 | 33 | 9 |
SYL_ST (syllabus status) | ADV | Complete, Active, Attrite | 6292 | 5 | 2 |
STAT_RESN (reason for syllabus status) | ADV | 20 strings | 260 | 0 | 0 |
NSS_UNSATS | ADV | Number | 3012 | 1 | 0 |
OFFICIAL_NMU | ADV | Number | 3012 | 1 | 0 |
NUM_RRU | ADV | Number | 3012 | 1 | 0 |
IPC | ADV | Number | 3012 | 0 | 0 |
FPC | ADV | Number | 3012 | 3 | 0 |
NSS | ADV | Floating point | 5221 | 1 | 4 |
Count of nonnull values for the FRS phase | FRS | Floating point | 18596 | 37 | 13 |
FRS_TW6_Grade | FRS | Number | 1274 | 29 | 8 |
FRS_TW6_Status | FRS | Number 0.0 to 1.0 | 7682 | 36 | 7 |
Observed correlations
Finding good attribute subsets for regressions
Best regression formulas found (1)
# candidates | Avg. error | Best formula |
4144 | 0.0096 | 0.0242*PFAR_Z_ASTBE + 0.0048*API_NSS + 0.0171*ANIT_Z_ASTBE + 0.0098*ATTFactor_Z_ASTBE + 0.0063*Personality5_Z_ASTBE + -0.0074*Personality6_Z_ASTBE + -0.0107*OAR_Z_ASTBE + -0.0136*Number of ASTBE + 0.0057*Personality3_Z_ASTBE + -0.0102*API_Test_FAILS + 1.0659 = PRI_FLIGHT_AV |
624 | 0.0038 | -0.0123*ANIT_Z_ASTBE + 0.014*DLTFactor_Z_ASTBE + -0.0055*ATTFactor_Z_ASTBE + 0.0111*Number of ASTBE + -0.0008*API_NSS + -0.0032*VTTFactor_Z_ASTBE + 0.0069*API_Test_FAILS + 1.181 = INT_FLIGHT_AV |
2880 | 0.0042 | 0.0017*API_NSS + -0.0069*IFS_TOTAL_FLIGHT_TIME + 0.0008*IFS_EOC + 0.0008*IFS_STG_1 + -0.0093*API_Test_FAILS + 0.9477 = ADV_FLIGHT_AV |
2476 | 0.2081 | -0.0088*Number of ASTBE + 0.0053*API_NSS + 0.0011*IFS_EOC + 0.0302*ANIT_Z_ASTBE + -0.0707*API_Test_FAILS + -0.029*IFS_TOTAL_FLIGHT_TIME + -0.0457*IFS_ACAD_FAIL + 0.0196*SkillFactor_Z_ASTBE + 5.0012 = PRI_COUNT |
2694 | 0.9148 | -0.0331*IFS_TOTAL_FLIGHT_TIME + 0.1961*PFAR_Z_ASTBE + 0.0939*ANIT_Z_ASTBE + 0.0479*Personality2_Z_ASTBE + 0.0432*Personality8_Z_ASTBE + 0.039*ATTFactor_Z_ASTBE + 1.6159 = INT_COUNT |
13844 | 2.1403 | -0.0626*IFS_TOTAL_FLIGHT_TIME + 0.4278*IFS_FLT_FAIL + 2.7704 = ADV_COUNT |
12668 | 0.8013 | -0.0937*IFS_TOTAL_FLIGHT_TIME + 0.0063*API_NSS + -0.1424*IFS_ACAD_FAIL + 2.0684 = FRS_COUNT |
1254 | 588.4 | -0.828*IFS_STG_2 + 0.7108*IFS_FAA + 7.4391*IFS_ACAD_FAIL + 0.9847*IFS_TOTAL_FLIGHT_TIME + -0.1748*IFS_STG_3 + 33.4863 = FRS_TW6_Grade |
5386 | 0.0003 | 0.0*IFS_EOC + -0.0021*API_Test_FAILS + 0.0001*IFS_FAA + 0.0001*API_NSS + 0.9673 = FRS_TW6_Status |
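Each row of the table is a linear formula: a weighted sum of attribute values plus a constant. As a minimal sketch of how such a formula could be applied to one candidate, the Python below evaluates the ADV_COUNT row. The coefficients are copied from the table; the helper name `predict`, the dictionary layout, and the candidate's attribute values are hypothetical and chosen only for illustration, and the units of IFS_TOTAL_FLIGHT_TIME are not stated in the source.

```python
# Coefficients copied from the ADV_COUNT row of the table above.
ADV_COUNT_COEFFICIENTS = {
    "IFS_TOTAL_FLIGHT_TIME": -0.0626,
    "IFS_FLT_FAIL": 0.4278,
}
ADV_COUNT_INTERCEPT = 2.7704


def predict(coefficients, intercept, candidate):
    """Evaluate a linear formula: intercept + sum(coefficient * attribute value)."""
    return intercept + sum(
        weight * candidate[name] for name, weight in coefficients.items()
    )


# Hypothetical candidate record; these values are made up, not taken from the data.
candidate = {"IFS_TOTAL_FLIGHT_TIME": 10.0, "IFS_FLT_FAIL": 0.0}

adv_count = predict(ADV_COUNT_COEFFICIENTS, ADV_COUNT_INTERCEPT, candidate)
print(round(adv_count, 4))
```

Keeping the coefficients in a dictionary keyed by attribute name means the same helper can evaluate any other row of the table by swapping in that row's coefficients and constant.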
Best regression formulas found (2)
# candidates | Avg. error | Best formula |
39 | 2.328 | -0.301*PRI_166A_CH-1_Academics_RAW_SCORE_DV + 0.423*PRI_166A_CH-2_Academics_RAW_SCORE_DV - 2.904*PRI_166A_CH-2_GRADE + 89.820 = ADV_ACADEMIC_AV |
857 | 8.636 | 30.087*PRI_166B_GRADE + 0.166*PRI_166B_Academics_RAW_SCORE_DV + 37.455 = ADV_ACADEMIC_AV |
14 | 2.157 | -19.007*PRI_TW-5_166A_GRADE + 0.032*PRI_TW-5_166A_Academics_RAW_SCORE_DV + 118.275 = ADV_ACADEMIC_AV |
63 | 2.610 | 49.237*NFO_TW-6_155C_Primary_GRADE + 0.040*NFO_TW-6_155C_Primary_Academics_RAW_SCORE_DV + 38.292 = ADV_ACADEMIC_AV |
443 | 3.257 | 8.488*NFO_TW-6_162A_Pri2_GRADE + 0.282*NFO_TW-6_162A_Pri1_Academics_RAW_SCORE_DV + 61.117 = ADV_ACADEMIC_AV |
870 | 7.185 | 17.809*NFO_TW-6_162_Pri1_GRADE + 0.043*NFO_TW-6_162_Pri1_Academics_RAW_SCORE_DV + 72.201 = ADV_ACADEMIC_AV |
16 | .00083 | 0.23217*PRI_166A_CH-1_GRADE + -0.00062*PRI_166A_CH-1_Academics_RAW_SCORE_DV + 0.80358 = ADV_FLIGHT_AV |
650 | .00012 | -0.07254*PRI_166B_GRADE + 1.24738 = ADV_FLIGHT_AV |
64 | .00000 | 0.14389*NFO_TW-6_155C_Primary_GRADE + 0.00017*NFO_TW-6_155C_Primary_Academics_RAW_SCORE_DV + 0.84174 = ADV_FLIGHT_AV |
871 | .00012 | 0.29038*NFO_TW-6_162A_Pri1_GRADE + 0.0009*NFO_TW-6_162A_Pri1_Academics_RAW_SCORE_DV + 0.60225 = ADV_FLIGHT_AV |
626 | .00007 | 0.46048*NFO_TW-6_162_Pri1_GRADE - 0.00003*NFO_TW-6_162_Pri2_Academics_RAW_SCORE_DV + 0.52616 = ADV_FLIGHT_AV |
A database design
Conclusions