ProPara Dataset (NAACL'18) - Official Release
The ProPara dataset is designed to train and test comprehension of simple paragraphs describing processes, e.g., photosynthesis. We treat the comprehension task as predicting, tracking, and answering questions about how entities change during the process. The dataset contains 488 paragraphs and 3,300 sentences. Each paragraph is richly annotated with the locations of all the main entities (the "participants") at every time step (sentence) during the process (~81,000 annotations), stored in a "grid" (participant x sentence).
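The "grid" described above can be pictured as a small table with one row per participant and one column per sentence. The following is a minimal sketch only: the example paragraph, the participant names, the use of `None` for "no known location", and the helper functions `location_of` and `moves` are all hypothetical illustrations, not the dataset's actual file format or API.

```python
from typing import Dict, List, Optional

# Hypothetical paragraph for illustration; not taken from the dataset.
sentences: List[str] = [
    "Roots absorb water from the soil.",
    "The water travels up to the leaf.",
    "The leaf uses the water to make sugar.",
]

# Grid (participant x sentence): each participant maps to its location
# after each sentence. None is an assumed marker for "no known location".
grid: Dict[str, List[Optional[str]]] = {
    "water": ["root", "leaf", None],
    "sugar": [None, None, "leaf"],
}


def location_of(participant: str, step: int) -> Optional[str]:
    """Return the participant's location after the 1-indexed sentence `step`."""
    return grid[participant][step - 1]


def moves(participant: str) -> List[int]:
    """Return the 1-indexed steps where the location differs from the previous step."""
    locs = grid[participant]
    return [i + 1 for i in range(1, len(locs)) if locs[i] != locs[i - 1]]
```

With a structure like this, a question such as "where is the water after sentence 2?" becomes a lookup (`location_of("water", 2)`), and "when does the sugar's state change?" becomes a scan along one row (`moves("sugar")`).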
For the website, visit http://data.allenai.org/propara/
{bhavanad,nikett,scottyih,peterc}@allenai.org, warrior.fu@gmail.com