Validation Datasets for Innovation Research
The I3 is creating an index of datasets, code, and methods that innovation researchers use to validate new research data and code. One example of such a dataset is Wetherbee et al.'s U.S. Patent Phrase to Phrase Matching, a standalone validation dataset for benchmarking patent-phrase matching models. Other validation datasets are purpose-built for specific projects and released alongside the datasets they are used to verify; still others are referenced in papers and documentation but are not made public.

This project will index the validation datasets that are available (so researchers can make use of them), point to ones that have been created but are not currently public, outline various validation methods and tools, and catalog validation datasets that do not yet exist but would be of use to the community.
If you'd be happy for us to contact you further, please leave your email
What methods do you use to validate your research? Are there any that you would like to use but cannot due to cost/lack of data?
Are there validation datasets that you've created or know of that are publicly available?
Please provide links where possible
Have you used a validation dataset released by someone else in your research? How did you modify it to better suit your purposes?
Are there any validation datasets you know of that aren't public?
How did you hear about them? If they're yours, is there anything preventing you from publishing them?
Can you describe any validation datasets you don't think exist, but would be useful to your research?