Columns: P | Stage Name | Steps | Priority | Dependencies | Funding Ideas ? | Tim | Dev ? | Wikipedia, Archive.org, Community | Person Days

Stage 0: Pre-public Steps

Test-drive
- Get API working with examples | X
- Fix examples | X

Create New Video
- Jefferson quote | X
- Examples | X
- Wikipedia | X

Business Plan | X

Stage 1: 12-month plan | Dependencies: Stage 0 | Funding ideas: Sloan Foundation ?

Summary:
A) Convert existing Wikipedia citations to Contextual Citations.
- mark up all articles to link quotes with source URLs | priority 1 | XX
- index and display contextual citations for quotes with sources | priority 1 | X
- This will require making the webservice scale well enough to process a high number of pages (all English articles) | priority 2 | X
- This will serve as an advertisement for CiteIt, attracting publishers' interest in engaging in Phase 4
- Scope:
  - English | priority 1
  - Other languages: research quote separators needed for cross-language compatibility | priority 5 | XX
- Add Contextual Markup to MediaWiki editor | priority 2 | X
- Add workflow to edit and moderate existing citations
  - allow existing citations to be mocked up and edited with link URL | priority 3 | X
  - flag mistaken citations | priority 3 | X
  - create view to aggregate and filter citations by citation type | priority 3 | X

B) Expand Webservice functionality to handle a very high percentage of quotation matching
- [c]apitalization | priority 2 | X
- [added] words | priority 2 | X
- ellipses ... (expand/contract elided context) | priority 2 | X
- Allow unique specification if a phrase occurs multiple times in a document | priority 2 | X
- Wikipedia article data will serve as a benchmark for the feature completeness and accuracy of CiteIt's webservice

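The matching cases listed under B) can be sketched as follows. This is only an illustration under stated assumptions, not CiteIt's actual implementation: the function names `quote_fragments` and `find_quote` are hypothetical, matching is done on lowercased, whitespace-collapsed text, and the returned offsets refer to that normalized text rather than the original source.

```python
import re

# Two or more dots, or the single-character ellipsis, mark elided text.
ELLIPSIS = re.compile(r"\s*(?:\.{2,}|…)\s*")
# "[T]he" -> "The": a bracketed single letter that changes capitalization.
BRACKET_LETTER = re.compile(r"\[(\w)\](?=\w)")
# " [added]" -> "": bracketed words inserted by the person quoting.
BRACKET_WORDS = re.compile(r"\s*\[[^\]]*\]")


def quote_fragments(quote):
    """Split a quote into normalized fragments that should appear verbatim in the source."""
    text = BRACKET_LETTER.sub(r"\1", quote)
    text = BRACKET_WORDS.sub("", text)
    parts = ELLIPSIS.split(text)
    return [" ".join(p.split()).lower() for p in parts if p.strip()]


def find_quote(source, quote):
    """Return (start, end) spans in the normalized source, one per occurrence.

    Several spans mean the phrase occurs more than once, so the caller
    would need to specify which occurrence is meant.
    """
    haystack = " ".join(source.split()).lower()
    fragments = quote_fragments(quote)
    if not fragments:
        return []
    matches = []
    start = haystack.find(fragments[0])
    while start != -1:
        pos = start + len(fragments[0])
        complete = True
        # Remaining fragments must follow in order after the first one.
        for fragment in fragments[1:]:
            nxt = haystack.find(fragment, pos)
            if nxt == -1:
                complete = False
                break
            pos = nxt + len(fragment)
        if complete:
            matches.append((start, pos))
        start = haystack.find(fragments[0], start + 1)
    return matches
```

With this sketch, `find_quote("a b a b", "a b")` returns two spans, which is the situation the "unique specification" item is meant to resolve.
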
C) Refactor Existing Codebase for robustness
- refactor the existing Python webservice (currently Flask) for robustness | priority 2 | X
- refactor the existing frontend JavaScript (currently jQuery) | priority 2 | X

D) Quality Testing & Documentation
- define test cases: unmatched quote types | priority 2 | X
- document features and tests | priority 3 | X
- validate quote accuracy: create statistics to describe the percentage of Wikipedia quotes that are perfect/fixable/wrong | priority 4 | XX

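The perfect/fixable/wrong statistics in D) could be computed along these lines. The category definitions here are assumptions made for illustration, since the plan does not define them: "perfect" means the quote appears verbatim in the source, "fixable" means it appears once case and whitespace are ignored, and "wrong" means neither. The names `classify` and `accuracy_report` are hypothetical.

```python
from collections import Counter

LABELS = ("perfect", "fixable", "wrong")


def _squash(text):
    """Lowercase and collapse runs of whitespace to single spaces."""
    return " ".join(text.split()).lower()


def classify(quote, source):
    """Assign one of the three assumed categories to a quote/source pair."""
    if quote in source:
        return "perfect"
    if _squash(quote) in _squash(source):
        return "fixable"
    return "wrong"


def accuracy_report(pairs):
    """Return the percentage of (quote, source) pairs in each category."""
    counts = Counter(classify(quote, source) for quote, source in pairs)
    total = sum(counts.values())
    return {
        label: (100.0 * counts[label] / total if total else 0.0)
        for label in LABELS
    }
```

A fuller version would likely plug in the bracket and ellipsis handling from the webservice before deciding that a quote is wrong.
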
E) Save Citations to Database
- when a citation is published (indexed), save the quote and context to a Postgres database | priority 3 | X
- create interface to view listing of database citations | priority 4 | X

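A minimal sketch of the two steps in E), saving a citation with its context and listing saved citations. To stay self-contained it uses Python's built-in `sqlite3` module as a stand-in for Postgres. The table name, column names, the `window` parameter, and the function names are all hypothetical and not taken from CiteIt.

```python
import sqlite3

# Hypothetical schema: one row per published citation.
SCHEMA = """
CREATE TABLE IF NOT EXISTS citation (
    id INTEGER PRIMARY KEY,
    citing_url TEXT NOT NULL,
    source_url TEXT NOT NULL,
    quote TEXT NOT NULL,
    context_before TEXT NOT NULL,
    context_after TEXT NOT NULL
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_citation(conn, citing_url, source_url, quote, source_text, window=500):
    """Store the quote plus up to `window` characters of context on each side."""
    start = source_text.find(quote)
    if start == -1:
        raise ValueError("quote not found in source text")
    end = start + len(quote)
    before = source_text[max(0, start - window):start]
    after = source_text[end:end + window]
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO citation "
            "(citing_url, source_url, quote, context_before, context_after) "
            "VALUES (?, ?, ?, ?, ?)",
            (citing_url, source_url, quote, before, after),
        )
    return cur.lastrowid


def list_citations(conn):
    """Return (id, citing_url, source_url, quote) rows for a listing view."""
    return conn.execute(
        "SELECT id, citing_url, source_url, quote FROM citation ORDER BY id"
    ).fetchall()
```

Moving this to Postgres would mainly mean swapping the driver and placeholder style; the shape of the data stays the same.
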
F) Provide support for Advanced features:
- PDF support: extract text from "digital PDFs" | priority 2 | XX
- PDF support: OCR to convert scanned images to text | priority 3 | XX
- YouTube transcript integration with contextual popup | priority 3 | XX
- highlight current word of transcript as video plays | priority 5 | XX
- Archiving: to Archive.org and/or a zip file | priority 4 | X
- Deep Link "Read More" to the highlighted text of the specific quote fragment | priority 4 | XX
- Add support for Archive.org Wayback Machine | priority 2 | XX
- Add support for quoting Twitter | priority 2 | XX
- Add support for Archive.org ISBN Search | priority 3 | X

G) Other Features
- add support for other document formats: Word, PowerPoint, Image, OpenOffice | priority 5 | XX
- add support for meta-data as custom HTML attributes | priority 4 | XX
- add support for nested quotations | priority 5 | XX
- JavaScript implementation of webservice indexing (Dave Winer?) | priority 5 | X

Stage 2: Scaling Up API | Dependencies: Stages 0,1 | Funding ideas: Sloan Foundation ?

A) Wikipedia Scaling
- Modify Docker to allow non-AWS JSON publishing | X
- Set up self-hosted MySQL/Postgres database | XX
- Optimize Docker instance | XX
- Set up Docker cluster | XX
- Add contextual citations to articles from x other languages | X
- Increase coverage of quote types to include more edge cases | X
- publish Wikipedia citations using Docker webservice | X

B) Scaling up Archive.org
- Build out Archive.org cluster | X
- Create API to programmatically retrieve data via webservice | X
- Publish citation database | X
- Load Wikipedia data into Archive.org database via Wikipedia's public webservice | X
- Create brand: citeit.archive.org? (similar brand to Wayback Machine) | X

Stage 3: Introduce Contextual Citations to Wikipedia editor | Dependencies: Stages 0,1 | Funding ideas: Sloan Foundation ?
- modify Wikipedia editor to enable input of URL | X
- integrate editor with webservice | X
- set up webservice and database | X
- testing | X
- phased introduction | X

Stage 4: Publishers | Dependencies: Stages 0,1,2 | Funding ideas: Knight Foundation ?

A) Research Publisher Needs | X

B) Alpha Test: Select Publishers
- help several publishers integrate their editorial tools with CiteIt API | X
- work with publishers to publish alpha citations | X