Inventory of FLOSS in the Cultural Heritage Domain
Name | Description | NeMO Activity Types | Developers | Project website | Code repository | Quality of Documentation | Ease of Adaptation | Code Quality | License | Examples | Last release | Last activity | CATEGORY 1 | CATEGORY 2 | CATEGORY 3 | CATEGORY 4 | CATEGORY 5
Getty Vocabularies | The AAT, TGN, ULAN, and CONA contain structured terminology for art and other material culture, archival materials, visual surrogates, and bibliographic materials. Compliant with international standards, they provide authoritative information for catalogers and researchers, and can be used to enhance access to databases and Web sites. | Open Data Commons Attribution License | v3.16/5/2015 | Ontology/Vocabulary Building and Management | Linked Open Data
pymarc | Pymarc is a Python library for working with bibliographic data encoded in MARC21. It should work under Python 2.x and 3.x. It provides an API for reading, writing and modifying MARC records. It was mostly designed to be an emergency eject seat, for getting your data assets out of MARC and into some kind of saner representation. | Organizing, Bibliographic Management | Gabriel Farrell, Mark Matienzo, Geoffrey Spear, Ed Summers | (27/07/2015) | 2015 | Data Annotation/Curation
GIMP | GIMP is the GNU Image Manipulation Program. It is a freely distributed piece of software for such tasks as photo retouching, image composition and image authoring. | Visualizing, Imaging | GIMP Team | Annotation/Tagging
Archivematica | Archivematica is a free and open-source digital preservation system that is designed to maintain standards-based, long-term access to collections of digital objects. | Preserving | Artefactual Systems Inc | documentation, arranged by versions, divided between user and administration manual, well structured and informative, illustrated with screenshots that are aligned with the documentation text. Professionally created screencast gives a high-level overview. Documentation hosted on Wiki. | Microservices allow adapting workflows and easily integrating new services. Scale out is possible by adding new processing nodes. Service oriented, separation between ingest, storage, and access services, all have well defined APIs. | Code is hosted on GitHub. High code quality (Python 89.7%, JavaScript 6.3%, Shell 3.2%, Other 0.9%), well structured, not many comments, no continuous integration, tests are available. | AGPL3 license | (user name:, password: demodemo) | v1.5.0 (June 2016) | actively ongoing | Digital Preservation
ICA-AtoM | ICA-AtoM is web-based archival description software that is based on the International Council on Archives ('ICA') standards. 'AtoM' is an acronym for 'Access to Memory'. ICA-AtoM is multi-lingual and supports multi-repository collections. | Adding metainformation, Curating | Artefactual Systems in collaboration with the ICA Program Commission (PCOM | source code is maintained on GitHub where the majority of the additions already happened in October 2012. A live demo of the software is available at and different types of manuals can be found here | There are approximately 250 institutions worldwide running ICA-AtoM. A list of them can be found here | The code is maintained in GitHub. PHP is used as the main language. There are 8 contributors but only 4 of them are active. The last AtoM maintenance release (2.0.1) was on 16.12.13. | GPL | ongoing | Collection Management | Repository Software
Annotorious | Annotorious is a JavaScript annotation library for images and zoomable images. Link the Annotorious source files (CSS and JavaScript) into an existing HTML page, and images inside this page will be enhanced with an interactive drawing and commenting tool. | Annotating, Commenting, Visualizing | Rainer Simon, Peter Pilgerstorfer, Paul Weichhart | http://annotorious.github.io | main website is very informative and its main menu includes links to: a demo page, a getting started page, a plugin page, an API documentation page and finally an about page. | There is a plugin development page available. In combination with a comprehensive GitHub page, it shouldn’t be too hard to develop plugins. | The code is maintained in GitHub. The JavaScript (using Google Closure) code seems clearly structured and includes comments. There is a useful page with information on how the code was built and can be adapted. | LGPL | v0.6 (14/08/2013) | actively ongoing | Media Annotation/Tagging
PyBossa | PyBossa is an open source platform for crowd-sourcing online (volunteer) assistance to perform tasks that require human cognition, knowledge or intelligence (e.g. image classification, transcription, information, location etc). | Crowdsourcing | Shuttleworth | http://www.pybossa.com | Affero General Public License | v0.2.2 (11/05/2015) | 2015 | Digital Asset Management | Media Annotation/Tagging | Social Applications
Open Exhibits | Multitouch and multiuser software | Collaborating | Ideum | BSD License | v.3.0, 13 November, 2013 | Interactive User Interface | Exhibition Management
Europeanap-ner | This tool takes container documents (MPEG21-DIDL, METS), parses all references to ALTO files and tries to find named entities in the pages (with most models: Location, Person, Organisation, Misc). The aim is to keep the physical location on the page available through the whole process to be able to highlight the results in a viewer. | Parsing | KB Research | Union Public License | March 2014 | Metadata Retrieval Services | Semantic Extraction
Omeka Contribution Plugin | Makes an Omeka site into one that accepts public contributions. The plugin provides a form to collect stories, images, or other files from the public and manages those contributions in your Omeka archive as items. | Publishing | Center for History and New Media, George Mason University | GPL | v 3.0.1 | August 2014 | Exhibition Management | Interactive User Interface | Social Applications
Ushahidi | Make smart decisions with a data management system that rapidly collects data from the crowd and visualizes what happened, when and where. | Managing, Browsing, Visualizing | Ushahidi | LGPL | v 3.0 beta 6 | Data Visualization | Geo-spatial Applications | Social Applications
Omeka S | A multisite reworking of Omeka on newer tech standards and aiming at interoperability with more systems | Publishing | Roy Rosenzweig Center for History and New Media, George Mason University | progress | v 0.4.0-alpha | Management
WorldCat Search | worldcat is a Python module that works with OCLC's WorldCat Affiliate web services. worldcat currently works with the WorldCat Search API, the xID (xISBN, xISSN, and xOCLCNUM) APIs, and lookups using the WorldCat Registry API. | Seeking | OCLC | Lesser General Public License v2 | Unreleased prototype | Search and Browsing
Cultural Enrichment Mashificator | Collaboration tools. | Collaborating | Jeremy Ottevanger | presentation of the idea together with a demo. There does not seem to be the intention to disseminate the code. No documentation available. Broken links. | PHP integration documented. | No code available | Unreleased | Content Retrieval Services | Metadata Retrieval Services
Global References Index to Biodiversity (GRIB) | It will be a tool to manage the taxonomic literature that is (a) already available in digital form, (b) in the process of being digitised, and (c) for which plans have been created for digitisation and to nominate literature to be digitised. | Bibliographic Management | Boris Jacob | Retrieval Services
EuropeanaXMLBuilder | A tool for downloading full record descriptions in the ESE format from any OAI-PMH-compliant interface. | Seeking, Managing, Resource sharing | PCSS Digital Libraries Team | LESSER GENERAL PUBLIC LICENSE | Unknown | Metadata Retrieval Services
dat | dat is an open source tool that enables the sharing of large datasets, allowing for a decentralized collaboration flow | Resource sharing | Max Ogden | main page embeds a very informative YouTube video on a talk the creator gives about the tool. The documentation on GitHub is extensive, well written and even illustrated. It gets you started and enthuses you to contribute. | The code base is well documented and quite small. Adapting the system, e.g. to support new formats, should not be too hard. | The code is well documented, concise and straightforward to understand. In the code however, it appears there are no comments. | BSD License | still pre-alpha | Metadata Mapping/Conversion/normalisation | Collection Management
Recline.js | A simple but powerful library for building data applications in pure JavaScript and HTML. | Programming | Max Ogden, Rufus Pollock | license | 2012 | actively ongoing | Infrastructure
Image Similarity Client | Image similarity search source code | Seeking | Sergiu Gordia | 2013 | Image Similarity
Timeline JS | TimelineJS is an open-source tool that enables you to build visually-rich interactive timelines and is available in 40 languages. | Visualizing | Northwestern University Knight Lab | well written documentation for users. Also very well written documentation for programmers (found on the GitHub page). Frequent documentation updates and live demo. | This is a web application that can be plugged in on websites. Information for plugging in the web app is available even on the first page of the project’s custom website. More extensive information exists on the GitHub page of the project. There is a list of known installations. | Excellent code quality and package structure. The project is a work of academic level and follows almost every direction towards openness for collaboration and explicitness of descriptions. | Mozilla Public License, v. 2.0 | 2013 | actively ongoing | Exhibition Management | Data Annotation/Curation | Media Applications
Neuraltalk | NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences. | Machine Learning | Stanford University | License | 2014 | Media Applications | Media Checker/Validation
Korbo | Korbo is a Semantic Web basket manager. It allows users to search, import and augment Linked Data resources. Personal augmented collections created with Korbo are then republished in the Linked Data cloud. Korbo is part of the Muruca suite. | Enriching | Net7 | yet production ready but prototype released 1/7/2012 | Semantic Enrichment
Delving Platform: CultureHub and SIP-Creator | Delving has developed and refined an open source platform with tools specifically tailored to the needs of the cultural heritage domain. | Preserving | Manuel Bernhardt, Gerald de Jong, Eric van der Meulen, Sjoerd Siebinga, Thomas Wikman, Juliane Stiller | http://delving.eu | documented and structured documentation. Extensible. | Java/Scala based (Java 79.7%, Scala 11.7%, XSLT 5.1%, JavaScript 3.1%, Groovy 0.5%), hosted on GitHub, actively maintained, tests available, code documented, well written and structured. | EUPL, Apache 2.0 | cycle | Active now | Metadata Mapping/Conversion/normalisation | Metadata Checker/Validation | Metadata Retrieval Services
Europeanap-dbpedia-disambiguation | A simple Python library and webservice that allows named entity disambiguation against a label database. The idea is to use a Solr query to filter possible candidates and use the more detailed analysis on string similarity, number of inlinks and entity type to select the "best" candidate. It contains code to handle (multi-lingual) DBpedia dumps and load them into a Solr backend. It also contains helper code for the annotation of ALTO 2.1 files that are used in the context of the Europeana Newspapers project. | Named Entity Recognition | KBNL Research | 2015 | Metadata Retrieval Services | Media Annotation/Tagging
ol3-cesium | OpenLayers - Cesium integration library. Create your map using OpenLayers 3, and visualize it on a globe with Cesium. | Visualizing | OpenLayers | to review their license | 2015 | Geo-spatial Applications
Ocrad.js | Optical character recognition program that can convert scanned images of text back into text. | Data Recognition | Kevin Kwok | GPL | 2014 | Content Retrieval Services | Digital Asset Management | Image Similarity
Franken + | The Initiative for Digital Humanities Media and Culture (IDHMC) at Texas A&M University, as part of its Early Modern OCR Project (eMOP), has created a new tool called Franken+ that provides a way to create font training for the Tesseract OCR engine using page images. This is in contrast to Tesseract's documented method of font training, which involves using a word processing program with a modern font. Franken+ works in conjunction with PRImA's Aletheia tool and allows users to easily and quickly identify one or more idealized forms of each glyph found on a set of page images. These identified forms are then used to generate a set of Franken-page images matching the page characteristics documented in Tesseract's training instructions but using a font used in an actual early modern printed document. | Imaging, Analysing | Texas A&M/Bryan Tarpley | webpage has full explanation | Apache 2.0 | December 2013 | 11/1/2014 | Collection Management
Numishare | Numishare is an open source suite of applications for managing digital cultural heritage artifacts, with a particular focus on coins and medals. | Managing | Ethan Gruber, American Numismatic Society | documentation for Numishare is very minimal and not easy to find. There also appear to be no regular updates or an active forum for discussion. | Apache License 2.0 | December 2010 | 26/10/2012 - constant development | Collection Management | Metadata Mapping/Conversion/normalisation | Exhibition Management
ArchivesSpace | A next-generation archives management application that will incorporate the best features of Archivist’s Toolkit (AT) and Archon. The project team is developing a technical platform, governance structure, and service model that will provide the archival community with a cutting-edge, extensible, and sustainable platform for describing analog and born-digital archival materials. The ArchivesSpace product is being developed using an Agile scrum process, guided by a Product Vision for ArchivesSpace. | Managing, Adding metainformation, Preserving | Hudson Molonglo | www.archivesspace.org | documentation, well structured, providing many additional documentation resources. Extensive documentation index. | ArchivesSpace has a backend for the major workflows, and a REST API and many interfaces for adapting the system. | Code is hosted on GitHub. High code quality (Ruby 76.6%, XSLT 8.9%, JavaScript 7.7%, CSS 6.0%, Shell 0.8%), well structured, not many comments, no continuous integration, tests are available. | ECL 2.0 | ArchivesSpace (March 2014) | on-going | Collection Management
mozjpeg | Modern JPEG encoder designed to reduce the size and load time of webpages that carry a lot of pictures. It was announced in March 2014 and is today supported by big websites like Facebook and used in tools like ImageOptim. Mozjpeg supports optimized Huffman tables, custom quantization matrices and modern techniques like trellis quantization while maintaining the same values for the structural similarity index (SSIM). | Encoding | Josh Aas, Mozilla research | (with Copyleft) | 5/18/2015 (v3.1) | 5/18/2015
Plumi | Plumi is a free open-source video-sharing app based on Plone | Resource Sharing | EngageMedia in collaboration with Unweb.me | Plumi pulls together a range of different products, so different licenses apply to different elements of the software. However, most are covered either by the GNU GPL or the Zope Public License | 4.5.2 (June 2015) | 2015 | Exhibition Management | Media Applications
Cross-Platform Authentication - Authorization Provider | Hybrid media devices, which can deliver audio, video and interactive content over both broadcast and broadband, create new opportunities and challenges for broadcasters. Augmenting the broadcast experience with interactive content delivered over the Internet changes the classic one-to-many paradigm, bringing it closer to a one-to-one relationship. Cross-Platform Authentication (CPA) offers an open standard for associating any media device with an online identity, which facilitates delivery of personalized services to these devices. | 31/7/2014 | Media Applications
PDFMiner | PDFMiner is a tool for extracting information from PDF documents. | Machine Learning, Extracting Data | Yusuke Shinyama | License - initial release | 5/4/2015 | Content Retrieval Services | Semantic Extraction
Mediathread | Mediathread is a Django site for multimedia annotations facilitating collaboration on video and image analysis. Developed at the Columbia Center for New Media Teaching and Learning (CCNMTL) | Annotating, Collaborating, Analyzing | Columbia Center for New Media Teaching and Learning | GPL31/1/2015 | Media Annotation/Tagging | Interactive User Interface
BitCurator Access | BitCurator Access software tools will assist collecting institutions
(libraries, archives, and museums) in providing web-based and local
access to born-digital materials held on disk images. BitCurator Access
will focus on software that simplifies access to raw and
forensically-packaged disk images, allowing collecting institutions to
incorporate these objects into access environments in a manner that
reflects the original order and relevant environmental context. The use
of open source digital forensics software will allow for detailed
analysis of file and file system provenance, quality and accessibility
of files, metadata in files and the file system, and residual (non-file
system) data contained within disk images.
| Direct accessing, Analyzing, Access Management, Adding metainformation | University of North Carolina at Chapel Hill | v 3 | 30/12/2014 | Collection Management
Dédalo: Intangible Heritage management and Oral History | Semantic RDF data sources | Annotating, Managing | Juan Franciso Onielfa, Alejandro Peña | http://www.fmomo.org | only available in Spanish, PDF documents. | No documentation available. | Web application (HTML, CSS, JavaScript, PHP, MySQL). Download only possible after registration, code not available. | GNU GPL v3 | 2012 | Collection Management | Content Retrieval Services
Spira | Spira is a framework for using the information in RDF.rb repositories as model
objects. It gives you the ability to work in a resource-oriented way without
losing access to statement-oriented nature of linked data, if you so choose.
It can be used either to access existing RDF data in a resource-oriented way,
or to create a new store of RDF data based on simple defaults.
| Managing | RubyGems | 28/1/2014 | Linked Open Data
Shred.js | JavaScript framework to enable annotating of diverse media from diverse sources | Annotating | Columbia Center for New Media Teaching and Learning | Annotation/Tagging
Open SKOS Client Ruby | A Ruby client for searching and retrieving SKOS concepts from an OpenSKOS instance over its RESTful API | Retrieving | Europeana | Union Public License 1.1 | 25/6/2014 | Linked Open Data
KriKri | A Rails engine for metadata aggregation, enhancement, and quality control. | Adding metainformation | DPLA | Retrieval Services
Tesseract | Tesseract is probably the most accurate open source OCR engine available. Combined with the Leptonica Image Processing Library it can read a wide variety of image formats and convert them to text in over 60 languages. | Conversioning | Ray Smith | License 2.0 | 23/10/2012 | Collection Management | Digital Asset Management | Ontology/Vocabulary Building and Management | Digital Preservation
Pencilcase | Pencil Case is a web app which offers tools for designers & developers. It showcases over 750 resources in design, development, learning, productivity, collaboration, publishing, testing, and more - all tracked by popularity in realtime. | Seeking | Pencil Case | (terms & guidelines) | Copyright 2015 | Collection Management | Digital Asset Management
ActiveTriples | An ActiveModel-like interface for RDF data. Models graphs as Resources with property/attribute configuration, accessors, and other methods to support Linked Data in a Ruby/Rails environment. This library was extracted from work on ActiveFedora. It is closely related to (and borrows some syntax from) Spira, but does some important things differently. | Modifying, Categorizing, Browsing | ActiveTriples | 2.0 | 22/8/2014 | Linked Open Data
Heidrun | Heiðrún (a.k.a. Heidrun, pronounced [roughly] hey-droon) is
DPLA's new metadata aggregation system, which we use to harvest
metadata from Hubs, map it to the DPLA Metadata Application Profile,
enrich it to clean up and add value, and to index it for use in the DPLA
Platform API. Heiðrún is implemented as a Ruby on Rails application
that builds on Krikri, a
Ruby gem for metadata harvesting, mapping, and enrichment. Heiðrún and
Krikri are both released as open source software under the MIT License.
| Gathering, Adding Metainformation | DPLA | on the website | MIT | 22/2/2015 | 7/1/2015 | Ingestion Tool
BRAT Rapid Annotation Tool | "online environment for collaborative text annotation"; focused on structured annotation of text, e.g., tagging named entities such as persons, organizations, etc., and their relationships | Annotation | University of Tokyo | Annotation/Curation
pdf2htmlEX | pdf2htmlEX renders PDF files in HTML, utilizing modern Web technologies, and aims to provide an accurate rendering while staying optimized for Web display. | Conversioning | Lu Wang | 22/07/2015 | 22/07/2015 | Semantic Extraction | Publication
DSpace | DSpace open source software is a turnkey repository application | Preserving | DSpace Foundation | high quality software documentation, commercial support. | Complex product ecosystem. | Maven-based, highly modularized, tests available, clean code, mainly Java, many comments explaining concepts and supporting readability. Good object oriented design. | Open Source: | Management | Metadata Retrieval Services | Repository Software
MFCS Metadata Form Creation System | The Metadata Form Creation System (MFCS) is WVU Libraries' answer for providing an easy-to-use interface for librarians, staff, and students for entering metadata and uploading digital items for our digital collections. MFCS is also our archival and preservation system. MFCS is a delivery and repository agnostic system. | Processing, Archiving, Preserving | Michael Bond, WVU Libraries | Open Source License | Management | Digital Asset Management | Digital Preservation
Telemeta | Telemeta is free and open source web audio archiving software which introduces useful and secure methods to organize, back up, index, transcode, analyse, share and publish any digitized audio or video file with extensive metadata in accordance with open web standards. It is dedicated to collaborative media archiving projects, research laboratories, libraries and digital humanities. | Archiving, Indexing, Resource Sharing, Analyzing | Guillaume Pellerin, Thomas Fillon | http://telemeta.org | (GPLv2 compatible) | Management | Digital Preservation | Media Annotation/Tagging | Metadata Mapping/Conversion/normalisation | Interactive User Interface
Amara | Amara gives individuals, communities, and larger organizations the power to overcome accessibility and language barriers for online video. Amara is composed of three main parts: a subtitle creation and viewing tool (aka the widget), a collaborative subtitling website, and an open protocol for subtitle search/delivery. | Producing, Transcribing, Collaborating, Retrieving, Subtitling | Participatory Culture Foundation | GPL2/6/4/2014 | Media Applications
Palette-server | palette-server is a small Flask based HTTP-pony to extract colours from an image. | Data Recognition, Extracting Data | Cooper Hewitt | Retrieval Services | Search and Browsing | Image Search
Omeka | A Collection/Exhibition Management System | Publishing | Roy Rosenzweig Center for History and New Media, George Mason University | GPL | 2.4.1 | 2016-05-25 | Exhibition Management | Collection Management
Pundit | Pundit is a semantic web annotation tool. It allows users to create structured data in their annotations by creating semantic relations between different kinds of items, be they portions of text in a web page, images, Linked Data entities or entries from a custom vocabulary. Annotations can be private or public and can be consumed by external applications via REST API. | Annotating | Net7 | http://thepund.it | Code is divided into Pundit Server Code and Pundit Client code | beta | 2015 | Semantic Enrichment | Data Annotation/Curation | Linked Open Data
Fedora Migrate | Migrates content from a Fedora3 repository to a Fedora4 one. | Migrating | Penn State | Management
SubjectsPlus | SubjectsPlus is a free and open source tool to help you manage several interrelated parts of your library website. | Managing | Joyner Library, East Carolina University / University of Miami Libraries | GPL | 19/12/2014 | Collection Management | Digital Asset Management
Clipper Prototype 3 | Clipper is a free open-source web application enabling researchers to create and share virtual clips without altering the original media files. Clipper enables you to mark the start and end of interesting events while playing audio or video data files through a standard web browser. You can add rich text annotations to each clip, and combine clips into playlists (cliplists). | Annotating | The City of Glasgow College, The Open University and Reachwill Ltd | Annotation/Tagging
Stacklife | StackLife is a community-based wayfinding tool for navigating the vast
resources of the combined Harvard Library System. It enables
researchers, teachers, scholars, and students to find what they need and
help others learn from them and their paths.
| Browsing | Harvard Library Innovation Lab | Management
DocSplit | Docsplit is a command-line utility and Ruby library for splitting apart documents into their component parts: searchable UTF-8 plain text, page images or thumbnails in any format, PDFs, single pages, and document metadata (title, author, number of pages...) | Modifying | Jeremy Ashkenas, DocumentCloud | and concise documentation, well written and structured, easily comprehensible. | Project fulfills specific purpose well, adaptation is easily possible by adapting Ruby scripts. | Pure Ruby project. Clean code, tests available, many comments support readability of code. Actively maintained, many contributors, code hosted on GitHub. | LGPL | 17/11/210145/2/2015 | Collection Management
ElasticSearch | ElasticSearch is a distributed RESTful search engine built for the cloud. | Seeking | user friendly website with extremely thorough documentation. Training, development support, and production support are all available. GitHub page with notes and directions. Blog is updated regularly and is up-to-date. There are also very interesting and helpful case studies available to see how groups have made use of ElasticSearch. One downside of the documentation is that it’s sometimes hard to find concrete examples of everyday use cases. | ElasticSearch has many configuration options for building the search engine needed for your use case. Extending the tool itself is possible by contributing on GitHub, but probably isn’t something you would need to do. | The (mainly Java) code seems well documented and commented. | Apache License, Version 2.0 | 16/7/2015 | 24/7/2015 | Search and Browsing | Content Retrieval Services | Metadata Retrieval Services | Collection Management
Evergreen | Evergreen is highly scalable software for libraries that helps library patrons find library materials, and helps libraries manage, catalog, and circulate those materials, no matter how large or complex the libraries. | Seeking, Managing, Resource sharing | Various | clear and extensive documentation. Updated regularly. There is a documentation interest group that meets periodically and the minutes are recorded. The meetings are open for anyone to attend. While the documentation is very clear and organized, some may find its vastness overwhelming. | Evergreen is a huge collection of software packages. To adapt it means a lot of reading, installing and testing. However, since the documentation is very thorough and extensive, it should be possible. | The reviewer did not inspect any code, because of the size of the codebase and the several different tools that are available. | GNU GPL | 16/6/2015 | 4/9/2011 | Collection Management | Search and Browsing
Islandora | Islandora is an open source framework that combines the Drupal and Fedora open software applications to create a robust digital asset management system that can be fitted to meet the short and long term collaborative requirements of digital data stewardship. Additional open source applications are added to this core stack to create what we call Solution Packs. | Managing, Collaborating, Storing | The Islandora Foundation | Wiki is set up to provide documentation | There seems to be a very active user group (islandora-dev forum) and there are 63 installations listed. | GitHub hosts all the repositories for Islandora. All of the 58 repositories are public. The code is written mainly in PHP and JavaScript. | GNU-GPL | ongoing | Digital Asset Management
Mirador | An open-source, web-based 'multi-up' viewer that supports zoom-pan-rotate functionality, ability to display/compare simple images, and images with annotations | Presenting | Stanford University | quality of the documentation for developers is excellent. There is no tutorial for users available but this is also not necessary. | Mirador can connect to repositories that provide an IIIF-compliant Metadata API. | The project started in August 2013. At the moment there are 4 contributors. JavaScript is the language of choice. A “first-time-setup” is quite easy to do. Just install GRUNT beforehand. On the command line, in the mirador folder type “grunt server” and open in your browser http://localhost:8000. | Apache License, Version 2.0 | Media Applications | Media Annotation/Tagging
Fedora Commons | Fedora (Flexible Extensible Digital Object Repository Architecture) was originally developed by researchers at Cornell University as an architecture for storing, managing, and accessing digital content in the form of digital objects inspired by the Kahn and Wilensky Framework. Fedora defines a set of abstractions for expressing digital objects, asserting relationships among digital objects, and linking "behaviors" (i.e., services) to digital objects. | Managing, Storing, Direct Accessing | various researchers at Cornell University | is not visibly and clearly labeled but when found it is very thorough and fairly organized. The documentation is hard to navigate and not as clear as some others. The project is ongoing and a new version will be coming out. Explanations about upgrading are provided. No visible community or forum space. | Apache License, Version 2.0. | ongoing | Digital Asset Management | Content Retrieval Services | Collection Management
UniversalViewer | The Universal Viewer is an open source project to enable cultural heritage institutions to present their digital artifacts in an IIIF-compliant and highly customisable user interface | Visualizing, Publishing | Edward Silverton | thorough documentation found on the wiki | License | Media Applications | Digital Asset Management | Publication
TAL (TV Application Layer) | The TV Application Layer (TAL) is an open source library for building applications for Connected TV devices. | Programming | BBC Future Media Platforms | 2.0 | 14/3/2015 | Media Applications
Europeana Client | Java client for the Europeana Search API. Refactored and Mavenized version of Europeana4j | Seeking | Sergiu Gordia | Retrieval Services | Linked Open Data
ol3 | A high-performance, feature-packed library for all your mapping needs | Visualizing | OpenLayers | 2.0, BSD, MIT | Applications
File Rename Tool (FRT) | The File Rename Tool (FRT): Delivery dates are obviously available on newspapers, so that they can be searched for by date later on. If a newspaper is not available in day folders, the ‘File Renaming Tool’ can help to bring them into the right structure and support libraries in renaming and reordering their images according to the Europeana Newspapers project specifications. The main idea of FRT is that images, which may be stored on year level, can be quickly ordered on the basis of issues and publishing date. | Modifying, Managing | University of Innsbruck | Retrieval Services | Digital Asset Management
CKAN | CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. | Managing, Publishing, Resource Sharing | Open Knowledge Foundation | quality of the website and the available documentation is very good. The purpose and features, including the API, of CKAN are all documented very well. | There is dedicated documentation on the writing of extensions. Moreover there are 60 external extensions available, which can serve as an example. The code base is large, but the documentation is good, so writing these extensions shouldn’t be too hard. | The Python code looks good, there are not many comments, but for the important objects (within the comments) references to the API docs are given. The API docs are very detailed. | Affero GNU GPL v3.0 | Management
OpenSeadragonAn open-source, web-based viewer for zoomable images, implemented in pure JavaScript.Editing, Browsing The quality of the documentation for developers is excellent. Everything you need can be found here: There is no tutorial for users available but this is also not necessary.Plugins can be used to display your image's scale in real-world measurements, enhance OpenSeadragon, provide coordinate conversion, pan, and zoom methods in a simplified coordinate system and provide hooks into an OpenSeadragon.Viewer and/or OpenSeadragon.MouseTracker for overriding/extending the default user-input event handling behavior. An issue tracker on GitHub is found here: The project started in January 2013. At the moment there are 23 contributors. JavaScript is the language of choice. A “first-time-setup” is quite easy to do. Just install GRUNT ( beforehand. On the command line, in the openseadragon folder type “grunt connect watch” and open in your browser http://localhost:8000/test/demo/basic.html.BSD license Applications
Avalon Media SystemThe Avalon Media System is an open source system for managing large collections of digital audio and video filesManaging, ArchivingIndiana University and Northwestern Universityhttp://www.avalonmediasystem.org 2.0 Management
CatmanduCatmandu provides a suite of Perl modules to ease the import, storage, retrieval, export and transformation of metadata records. PreservingNicolas Steenlant, Patrick Hochstenbach The main site offers a brief introduction to the capabilities of Catmandu and offers an extensive tutorial. The GitHub code repository also contains developer documentation, but it needs to be generated after downloading (or you can find it by browsing the code repository). The developer documentation seems well written and quite extensive.The code repository in GitHub also includes a page directed to contributors, but does not seem to describe any plugin architecture. Since the code seems professionally maintained and fairly well documented, adding customizations to the code seems worth considering.The Perl code looks well structured, well written, and contains comments in the form of perldoc. There is a long list of tests available. Also the code is included in CPAN (a repository of Perl libraries).GPL-2, or later11/2013ongoingMetadata Retrieval ServicesMetadata Mapping/Conversion/normalisationCollection Management
CollectionSpaceCollectionSpace is an open-source collections management application that meets the needs of museums, historical societies, and other collection-holding organizations. CollectionSpace is designed to be configurable to each organization’s needs, serving as a gateway to digital and physical assets across an institution. The software is freely distributed via open-source licensing, and an active developer community ensures that CollectionSpace is continually improving. ManagingJanuary 2014: Lyrasis is now the organizational home of CollectionSpace. (The project was initiated and led by Museum of the Moving Image. Transition team includes developers based at University of California, Berkeley and Fluid Project at Ontario College of Art and Design; and Jesse Martinez, Freelance developer and Service Provider.) Project Partners 2008-2013 included: University of California Berkeley, Fluid Project at Ontario College of Art and Design, and Centre for Applied Research in Educational Technologies (CARET) University of Cambridge. The home page clearly provides a link to the documentation, which is very thorough and covers the most relevant topics such as: system requirements, how to configure & install it, how to use and maintain it and also how to develop customizations. The overall quality of this documentation is fairly good.CollectionSpace describes having a so-called hook system in place (as used in e.g. Drupal or WordPress), a design pattern to conveniently extend a system. Hook systems in general take some time to learn, and in this case specific documentation on the hooks is yet to be added (some limited examples are available though).The different parts, namely the UI, services, application, tools, etc., are in different sections in GitHub, making the distinction clear. Each of these sections is sparsely documented; however, the code looks good, with JavaDoc and Maven pom.xml files. ECL 2.0www.demo.collectionspace.org10/2013on-goingCollection Management
Razuna DAMOpen source digital asset managementManagingRazuna Affero Public License v.3 or later10/2013Digital Asset Management
DBpedia SpotlightDBpedia Spotlight is a tool for automatically annotating mentions
of DBpedia resources in text, providing a solution for linking
unstructured information sources to the Linked Open Data cloud through
Annotating, LinkingPablo Mendes (Freie Universität Berlin), Jun 2010-present.
Jo Daiber (Charles University in Prague), Mar 2011-present.
Prof. Dr. Chris Bizer (Freie Universität Berlin), supervisor, Jun 2010-present. documented, short documentation on Github, further documentation on Wiki, well structured and written.REST web services allow integration and adaptation.Java/Scala based project (Java 48.7% Scala 48.1% Shell 1.7% Python 1.5%), actively maintained, last commit days ago, continuous integration, well-structured, maven-based build, tests available. Many branches and contributors.Apache License, 2.0, LingPipe10/2012Linked Open DataData Annotation/Curation
ExifToolExifTool is a platform-independent Perl library plus a command-line application for reading, writing and editing meta information in a wide variety of files.Adding MetainformationPhil Harvey GNU General Public License, v.1 or later10.20 (June 13, 2016)Metadata Retrieval ServicesMetadata Mapping/Conversion/normalisationMetadata Checker/Validation
Collective AccessThis is one of the more powerful open source CMS (collection management systems). It allows you to manage metadata in various formats, using metadata profile descriptions. Available profiles include VRAcore, EBUcore, LIDO (contributed by LIBIS), etc.Managing, Presenting, Adding meta-informationWhirl-i-Gig documentation available on Wiki, well structured. Demo, Installation instructions, Upgrade instructionsPHP Web Application, easily extensible. No standards. Documentation for adaptation, API documentation available.Mainly PHP-based (PHP 86.7% JavaScript 10.8% CSS 1.9% Other 0.6%). Code hosted on Github. Highly active, last commit 2 days old. Code is well written and documented. Separation into modules, clean code layout. Bugtracker available.GNU GPL v.2 ManagementContent Retrieval ServicesExhibition Management
FixityFixity is a utility for the documentation and regular review of stored files.ManagingAudioVisual Preservation Solutions License, Version 2.0.1/13/2014, v.0.3Metadata Mapping/Conversion/normalisationMedia Checker/Validation
Question2AnswerA Q2A site helps your online community to share knowledge. People with questions get the answers they need. The community is enriched by commenting, voting, notifications, points and rankings.ConsultingGideon Greenspan GPL v2 (27/07/2015)2015Collection Management
Popcorn.jsPopcorn.js is Mozilla's HTML5 video and media library for the open web. It allows web developers, filmmakers, artists, designers and others to easily create timeline-based web productions. Popcorn.js helps simplify media API and implementation differences between browsers and includes a powerful event system and a rich plugin architecture and plugins.Web-developing Mozilla License Media Applications
JHOVEJHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects.Processing, PreservingJSTOR & the Harvard University Library
Documentation is at (May 12, 2016)ongoingDigital PreservationMedia Annotation/Tagging
TemaTresTemaTres is an open source vocabulary server, web application to manage and exploit vocabularies, thesauri, taxonomies and formal representations of knowledge.Adding MetainformationDiego Ferreyra license Building and ManagementCollection Management
MediaInfoA convenient unified display of the most relevant technical and tag data for video and audio files.PresentingMediaArea The software is very easy to use. Documentation could not be located, but it is also not necessary.Bug reports ( and feature requests ( can be filed and there is also a forum for all other questions ( They are all very active.The code is hosted on SourceForge and development seems to be ongoing. Binaries for nearly all operating systems are available: license0.7.86 (May 31, 2016)actively ongoingOntology/Vocabulary Building and ManagementMedia Applications
EADitorEADitor is an EAD (Encoded Archival Description) editor based on Orbeon XForms. It uses various external services (e.g. Getty AAT, TGN, ULAN) for accessing LOD to be used in the descriptions. A companion tool is xEAC for creating and managing EAC-CPF records (corporate bodies, persons, families)ManagingEthan Gruber, American Numismatic Society The documentation available on Github reads a bit like an unstructured copy & paste text blog. But there is a lot of information available, and blog posts explain the concepts of the software and give further examples.REST or SOAP interfaces allow easy integration. No plugin mechanism.CSS/XSLT/JavaScript-based project hosted on Github (CSS 62.3% XSLT 17.3% JavaScript 15.0% XProc 5.4%). One contributor (highly active Github committer), the code is being actively maintained. A lot of XSLT is not commented but readable.Apache License 2.0 beta (December 2011)June 2012Collection ManagementMetadata Retrieval Services
CartoCARTO is an open, powerful, and intuitive platform for discovering and
predicting the key insights underlying the location data in our world.
Imaging, GeoreferencingCARTO Applications
Kamailio is an industrial-strength, free server for realtime communication, based on the Session Initiation ProtocolCommunicatingKamailio GPL v2.010/2/2015SIP
HydraHydra is a repository solution that is being used by institutions worldwide to provide access to their digital content. Hydra provides a versatile and feature rich environment for end-users and repository administrators alike.Curating, Managing, Preservation GoodApache 2 license PreservationDigital Asset Management
HyperImageThe HyperImage platform supports the linking of (audio)-visual objects,
texts and mixed-media documents. HyperImage allows any number of
details, or subregions, within an image to be highlighted and described,
and for annotations within a corpus to be linked to each other, making
them accessible in indices. Interim results as well as final versions
can be compiled at any time as an online/offline hypermedia publication.
This makes HyperImage a suitable research environment for digital
humanities and eScience projects, providing a common research and
publication environment for groups as well as individuals.
Annotating, Analyzing,

Heinz-Günter Kuper, Dr.,JML
Jens-Martin Loebel, Dr. Apache 2 license Annotation/Tagging
GATE (General Architecture for Text Engineering)GATE is over 15 years old and is in active use for all types of computational tasks involving human language. GATE solves problems concerning text analysis or human language processing.Processing, Parsing, Named entity recognitionVarious LESSER GENERAL PUBLIC LICENSE v.3.0 ongoingAlignment ToolsMetadata Mapping/Conversion/normalisationMedia Checker/Validation
FromThePageFromThePage is an open-source tool that allows volunteers to collaborate to transcribe handwritten documents.Collaborating, TranscribingBen W. Brumfield The FromThePage home page provides very little documentation and the Github page also has very little documentation. The documentation that is there is clear and visible but severely lacking in detail.GNU AGPL v3.06/1/20156/1/2015Social ApplicationsContextualisation
Annotation Studio An online annotation platform for teaching and learning in the humanitiesAnnotating MIT GPL Annotation/Tagging
Active_fedoraRubydora and ActiveFedora provide a set of Ruby gems for creating and
managing objects in the Fedora Repository Architecture
( ActiveFedora
is loosely based on “ActiveRecord” in Rails. The 3.x series of
ActiveFedora depends on Rails 3, specifically activemodel and
PreservingProject Hydra 2.05/2/2015Collection Management
Kuali OLEKuali OLE is the first system designed by and for academic and research libraries for managing and delivering intellectual information. A community of partners will deliver an enterprise-ready,
community-source software package to manage and provide access not only
to items in their collections but also to licensed and local digital
content. Kuali OLE (pronounced oh-LAY, for Open Library Environment)
features a governance model in which the entire library community can
collaborate to own the resulting intellectual property.
Managing, AccessingKuali OLE 2.05/1/2015on-goingCollection Management
Diva.jsDiva.js (Document Image Viewer with AJAX) is a
JavaScript frontend for viewing documents, designed to work with digital
libraries to present multi-page documents as a single, continuous item.
Only the pages that are being viewed at any given time are actually
present in the document, with the rest appended as necessary, ensuring
efficient memory usage and high loading speeds. Written as a jQuery plugin, diva.js requires the jQuery JavaScript library. Diva's back end is the IIPImage server.

Modifying, BrowsingDistributed Digital Music Archives and Libraries w/ attribution3/6/201510/6/2015PublicationExhibition Management
sigilSigil is a free, open source, multi-platform e-book editor, designed for editing books in EPUB format. EditingStrahinja Marković, John Schember GPL v33/2/2015Publication
CesiumCesium is a JavaScript library for creating 3D globes and 2D maps in a web browser without a plugin. It uses WebGL for hardware-accelerated graphics, and is cross-platform, cross-browser, and tuned for dynamic-data visualization.ImagingAGI 2.0 Applications
FrogFrog, formerly known as Tadpole, is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package. Most modules were created in the 1990s at the ILK Research Group (Tilburg University, the Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium). Over the years they have been integrated into a single text processing tool. More recently, a dependency parser, a base phrase chunker and a named-entity recognizer module were added. Processing, Parsing, Named entity recognitionUniversity of Tilburg install & run, but "we are in the process of writing a reference guide for Frog that explains all options in detail."Limited. No guidelines on how to add other languages.GNU GPL1/29/20151/29/2015Semantic Extraction
Binarization and Conversion Tool

The BCT tool can be used to produce JPEG2000 or JPEG files of newspaper master images for presentation on the web. This tool calls two other tools, a binarization method from Basilis Gatos which is optimised for OCR, and Kakadu, a software development kit for creating JPEG2000 images. Therefore, in order to fully use the features of BCT, both tools must be installed and licensed. However, feel free to call your own tools from BCT. Imaging, CompressingUniversity of Innsbruck SearchMedia Checker/Validation
D3D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation. Managing, VisualizingDustin Ewers License Visualization