ENE eval mapping
al+ FG_NER - Stanford NER category mapping method
The original Stanford NER model recognizes four named entity categories: PERSON, LOCATION, ORGANIZATION, and MISC (miscellaneous). al+ FG_NER, on the other hand, has 200 categories organized into a three-level hierarchy. The top-level categories are PERSON, LOCATION, ORGANIZATION, EVENT, PRODUCT, TIMEX, ... Each top-level category is divided into several second-level categories; for example, ORGANIZATION is divided into Political_Organization, Corporation, Sports_Organization, ... Each second-level category is in turn divided into several third-level categories; for example, Political_Organization is divided into Political_Party, Cabinet, Government, ...

Because of this hierarchy, the top-level categories PERSON, LOCATION, and ORGANIZATION of al+ FG_NER map directly to PERSON, LOCATION, and ORGANIZATION in Stanford NER, and every category beneath them inherits that mapping. Political_Party, for example, is automatically mapped to Stanford ORGANIZATION. Similarly, all categories under LOCATION in al+ FG_NER are mapped to Stanford LOCATION, except Address (which contains URL, email, and Postal Address). PERSON in al+ FG_NER is mapped directly to PERSON in Stanford NER. The Stanford MISC category mixes many different entity types, so no specific al+ FG_NER category can be mapped to it; MISC is therefore excluded from the evaluation.
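The Stanford mapping rule described above can be sketched as a small lookup on the category path. This is only an illustration, not the actual evaluation code: the slash-separated path format (for example `ORGANIZATION/Political_Organization/Political_Party`) and the function name `map_to_stanford` are assumptions made for the example.

```python
from typing import Optional

# Top-level al+ FG_NER categories that have a Stanford counterpart.
STANFORD_TOP_LEVEL = {"PERSON", "LOCATION", "ORGANIZATION"}


def map_to_stanford(path: str) -> Optional[str]:
    """Map a fine-grained category path to a Stanford NER category.

    `path` is a hypothetical slash-separated path from the top level down,
    e.g. "ORGANIZATION/Political_Organization/Political_Party".
    Returns None when the category is left out of the evaluation.
    """
    parts = path.split("/")
    top = parts[0]
    # Categories outside PERSON / LOCATION / ORGANIZATION are not mapped
    # (nothing is mapped to Stanford MISC).
    if top not in STANFORD_TOP_LEVEL:
        return None
    # Address (URL, email, Postal Address) is the one exception under LOCATION.
    if top == "LOCATION" and len(parts) > 1 and parts[1] == "Address":
        return None
    # Every other descendant inherits the mapping of its top-level category.
    return top
```

With this sketch, `map_to_stanford("ORGANIZATION/Political_Organization/Political_Party")` returns `"ORGANIZATION"`, while `map_to_stanford("LOCATION/Address/URL")` and `map_to_stanford("EVENT")` both return `None`.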
al+ FG_NER - Google NER category mapping method
The named entity recognizer (NER) in the Google NLP API supports eight named entity categories: PERSON, LOCATION, ORGANIZATION, CONSUMER_GOOD, EVENT, WORK_OF_ART, OTHER, and UNKNOWN. Since al+ FG_NER has 200 categories, its categories must be mapped to the Google NER categories for evaluation. The method is similar to the Stanford mapping. The top-level categories PERSON, ORGANIZATION, and EVENT are mapped directly to PERSON, ORGANIZATION, and EVENT in Google NER. LOCATION is also mapped directly to Google NER LOCATION, except for the categories under Address (e.g., email, URL, ...). For CONSUMER_GOOD in Google NER, we map some of the categories under the top-level category PRODUCT of al+ FG_NER (excluding Art, Language, Title, Rule, ...). For WORK_OF_ART, we directly map the second-level category Art of al+ FG_NER. The OTHER and UNKNOWN categories in Google NER contain miscellaneous entity types that cannot be mapped, so they are neither mapped nor evaluated.
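The Google mapping can be sketched in the same way. Again this is only an illustration under stated assumptions: the slash-separated path format and the name `map_to_google` are hypothetical, and the exclusion set lists only the four PRODUCT subcategories named above (the full list is elided in the description). Treating every other PRODUCT subcategory as CONSUMER_GOOD is also an assumption, since the description says only that some of them are mapped.

```python
from typing import Optional

# Top-level categories mapped directly to the Google category of the same name.
DIRECT = {"PERSON", "ORGANIZATION", "EVENT"}

# PRODUCT subcategories named as excluded from CONSUMER_GOOD.
# The source list ends with "...", so this set is incomplete by construction.
PRODUCT_EXCLUDED = {"Art", "Language", "Title", "Rule"}


def map_to_google(path: str) -> Optional[str]:
    """Map a fine-grained category path to a Google NER category.

    `path` is a hypothetical slash-separated path from the top level down.
    Returns None when the category is left out of the evaluation
    (nothing is mapped to Google OTHER or UNKNOWN).
    """
    parts = path.split("/")
    top = parts[0]
    second = parts[1] if len(parts) > 1 else None

    if top in DIRECT:
        return top
    if top == "LOCATION":
        # Categories under Address (email, URL, ...) are not mapped.
        return None if second == "Address" else "LOCATION"
    if top == "PRODUCT":
        if second == "Art":
            return "WORK_OF_ART"
        # A bare PRODUCT label or an excluded subcategory stays unmapped.
        if second is None or second in PRODUCT_EXCLUDED:
            return None
        # Assumption: all remaining PRODUCT subcategories are consumer goods.
        return "CONSUMER_GOOD"
    return None
```

For example, `map_to_google("PRODUCT/Art")` returns `"WORK_OF_ART"`, `map_to_google("PRODUCT/Language")` returns `None`, and a path under a non-excluded PRODUCT subcategory (any second-level name not in `PRODUCT_EXCLUDED`) returns `"CONSUMER_GOOD"`.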
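Rationale for the mapping of the Stanford TEST DATA
Stanford NER recognizes four categories: PERSON, LOCATION, ORGANIZATION, and MISC (entities other than PER, LOC, and ORG).
al+'s Extended Named Entity hierarchy, on the other hand, has 200 categories divided into three levels. The top level includes PERSON, LOCATION, ORGANIZATION, EVENT, PRODUCT, TIMEX, and so on. ORGANIZATION, for example, is divided into Political_Organization, Corporation, and others.
Political_Organization (and the categories beneath it, such as Political_Party and Military) can therefore all be mapped to Stanford ORGANIZATION.
Accordingly, the mapping follows the logic below.
The Extended Named Entity hierarchy is mapped as it stands: if the top level of an Extended Named Entity category is LOCATION, the category is mapped to Stanford LOCATION.
However, Address (URL, email, Postal address) alone is not mapped, because Stanford does not recognize electronic locations.
The other categories (PERSON, ORGANIZATION) are mapped as they are.
Stanford MISC contains many different categories, so it is not mapped.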
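Rationale for the mapping of the googleNER TEST DATA
The NER in the Google NLP API has eight categories: PERSON, LOCATION, ORGANIZATION, CONSUMER_GOOD, EVENT, WORK_OF_ART, OTHER, and UNKNOWN. al+, on the other hand, has 200 categories divided into three levels, so the categories need to be mapped for evaluation.
PERSON, ORGANIZATION, and LOCATION are mapped for the same reasons as above.
CONSUMER_GOOD corresponds to part of al+'s Product, so the corresponding part under Product is mapped to Google CONSUMER_GOOD.
Google also has an EVENT category; since al+ has a top-level category EVENT as well, the two are mapped.
Finally, "ART" under Product is mapped to Google WORK_OF_ART.
Google's OTHER mixes many different kinds of entities, so it is neither mapped nor evaluated. UNKNOWN has no corresponding category, so it is not mapped and is left out of the evaluation.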