The overall goals of this hackathon are to compare and improve open-source tools and techniques for capturing data from biological specimen labels in scientific collections. We will compare the output of existing OCR software on a standard set of images and evaluate the natural language processing algorithms currently used to parse OCR output into standard Darwin Core output. Before applying, please take some time to understand the project by reviewing the call for participation document at http://tinyurl.com/aocrHack
Integrated Digitized Biocollections (iDigBio, https://www.idigbio.org/) is seeking a diverse group of people to participate in a process that will generate code, documentation, examples, tests, and demos. The number of participants is limited to 20, so please apply as soon as possible so that we can begin forming teams.