Centre for Humanitarian Data
MODEL EVALUATION MATRIX: 510 Typhoon Impact Model - February 2021
Columns: Model Card Section | Strengths | Weaknesses | How weaknesses are currently addressed (to be filled by the Client after step 5)
1. Intended Use
Strengths: Clear definition of the use case and a clear overview of limitations and assumptions. The threshold based on % damaged houses / people living in the municipality seems robust and insightful.
Weaknesses: Several thresholds need to be determined, and it is not clear who sets them. Given several open questions about the evaluation part of the model card, the in-scope use case might need some clarification.
How weaknesses are currently addressed: While the model is trained using percentages, the number of houses damaged is used for the trigger threshold, as it provides an indicator of the scale of the disaster.
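To make the relation between the percentage output and the count-based trigger concrete, the following is a minimal sketch, not the 510 implementation. The function names, the per-municipality input format, and the choice to sum counts across municipalities before comparing against a single threshold are all assumptions made for illustration; the model card does not specify them here.

```python
from typing import Iterable, Tuple


def houses_damaged(pct_damaged: float, total_houses: int) -> float:
    """Convert a predicted damage percentage (0-100) into a house count."""
    if not 0.0 <= pct_damaged <= 100.0:
        raise ValueError("pct_damaged must be between 0 and 100")
    return pct_damaged * total_houses / 100.0


def trigger_reached(
    predictions: Iterable[Tuple[float, int]], threshold_houses: float
) -> bool:
    """Check a count-based trigger.

    predictions: (pct_damaged, total_houses) per municipality (hypothetical format).
    Assumption: counts are summed over all municipalities before comparison.
    """
    total = sum(houses_damaged(pct, n) for pct, n in predictions)
    return total >= threshold_houses


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    sample = [(10.0, 2000), (5.0, 1000)]
    print(trigger_reached(sample, threshold_houses=250))
```

A count-based threshold of this kind reflects absolute scale: a small percentage in a large municipality can contribute more damaged houses than a large percentage in a small one.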
2. Model Development
Strengths: Very clear overview of the data used and of how the model depends on data input and data quality. The model methodology is clear and builds on out-of-the-box packages.
Weaknesses: I have several questions on data input/output (see comments on the model card). Specifically: is data quality correlated with social vulnerability indicators? How are physical structure indicators of wall/roof material associated with social vulnerability indicators? Also, the text mentions using Facebook data for settlement layers, but the table does not. It would be good to be clear about which part of the input data originates from Facebook data and how the authors plan to ensure that such data remains available in the future.
How weaknesses are currently addressed: An analysis was performed to establish a correlation between people affected and houses damaged.

The Facebook settlement data estimates population distribution from satellite imagery and is used for the computation of hazard maps. If it is no longer available, alternative population raster sources such as WorldPop can be used.
3. Model Evaluation
Strengths: Evaluation against one benchmark on six accuracy measures. Comparison of expected/observed damage in clear plots for both the current model and the benchmark.
Weaknesses: No clear interpretation of the plots shown. On which outcomes was the model compared? See comments on the model card for questions on the evaluation.
How weaknesses are currently addressed: Outlier data is mainly due to Typhoon Haiyan, which caused an unprecedented surge that is not yet captured in the model.
4. Operational Readiness
Strengths: Open project on GitHub for everyone to assess and run within a Docker environment (no detailed code check was performed during this review).
Weaknesses: Just a minor question on how it is decided when the model will be run. On every typhoon that is active in the area? On specific subtypes?
How weaknesses are currently addressed: The model is run once a typhoon is located in the Philippine Area of Responsibility (PAR) and is strong enough to be included in the ECMWF modelling.
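The run rule described in the response above could be sketched as follows. This is an illustrative sketch only, not the project's code: the function names are hypothetical, the PAR vertices are the commonly cited boundary coordinates and should be verified against PAGASA before any use, and the ECMWF condition is reduced to a boolean flag supplied by the caller.

```python
from typing import List, Tuple

# Assumption: commonly cited PAR boundary vertices as (lat, lon) in degrees.
# Verify against the official PAGASA definition before relying on them.
PAR_VERTICES: List[Tuple[float, float]] = [
    (5.0, 115.0), (15.0, 115.0), (21.0, 120.0),
    (25.0, 120.0), (25.0, 135.0), (5.0, 135.0),
]


def inside_polygon(lat: float, lon: float,
                   vertices: List[Tuple[float, float]]) -> bool:
    """Ray-casting point-in-polygon test on a flat lat/lon plane."""
    inside = False
    n = len(vertices)
    for i in range(n):
        lat1, lon1 = vertices[i]
        lat2, lon2 = vertices[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            lon_cross = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < lon_cross:
                inside = not inside
    return inside


def should_run_model(lat: float, lon: float, in_ecmwf_forecast: bool) -> bool:
    """Run only if the typhoon is inside the PAR and covered by ECMWF modelling."""
    return in_ecmwf_forecast and inside_polygon(lat, lon, PAR_VERTICES)


if __name__ == "__main__":
    # Hypothetical typhoon position, for illustration only.
    print(should_run_model(12.0, 125.0, in_ecmwf_forecast=True))
```

Treating the ECMWF condition as an input flag keeps the sketch independent of any particular forecast feed, which this matrix does not describe.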