1 of 16

Canberra Workshop

July 2014

2 of 16

Agenda

  • Architectural overview
  • Collectory
  • Biocache
  • Hubs
  • Cassandra
  • Solr and Lucene

5 of 16

ALA Demo

  • Focussing on subset of tools
    • Collectory
    • Biocache services
    • Biocache command line tool
    • Hubs

8 of 16

Collectory

  • Data resources
  • Collection
  • Institution
  • Data provider
  • Data hub

9 of 16

Collectory - data mapping

10 of 16

Collectory

  • Grails app
  • MySQL database
  • External configuration file

11 of 16

Biocache services

  • RESTful JSON
  • Web Mapping Services (WMS)
  • Downloads
  • Java Spring MVC webapp
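
A RESTful JSON service of this kind is typically queried with a plain HTTP GET. The sketch below only builds such a URL and does not call any server. The host name, the `/occurrences/search` path, and the `q` and `pageSize` parameter names are assumptions made for illustration; check them against the service you are actually running.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration; substitute your own biocache services host.
BASE_URL = "https://biocache.example.org/ws"


def occurrence_search_url(query, page_size=10, base_url=BASE_URL):
    """Build a GET URL for a JSON occurrence search.

    The endpoint path and parameter names are assumptions, not taken
    from the slides.
    """
    params = urlencode({"q": query, "pageSize": page_size})
    return f"{base_url}/occurrences/search?{params}"


url = occurrence_search_url("genus:Macropus", page_size=5)
print(url)
```

The returned URL can be passed to any HTTP client; the response would then be parsed as JSON.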

12 of 16

Biocache command line tool

  • Java executable
    • Scala implementation
    • External configuration file
    • Vocabulary files
  • Loading, processing, sampling, indexing
  • Other bits
    • Duplicate detection
    • Outlier detection
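
The four stages named above (loading, processing, sampling, indexing) can be pictured as a small pipeline. This is a toy sketch in Python, not the Scala tool itself: every function name, field name, and the vocabulary below are hypothetical, and the behaviour of each stage is an assumption about what such a stage might do.

```python
# Hypothetical vocabulary mapping raw terms to a preferred term.
BASIS_VOCAB = {"specimen": "PreservedSpecimen", "observation": "HumanObservation"}


def load(rows):
    """Loading: store each raw row under its record ID, untouched."""
    return {row["id"]: {"raw": dict(row)} for row in rows}


def process(store, vocab):
    """Processing: derive cleaned values and keep them next to the raw ones."""
    for rec in store.values():
        raw = rec["raw"]
        basis = raw.get("basisOfRecord", "").strip().lower()
        rec["processed"] = {
            "basisOfRecord": vocab.get(basis),  # None if the term is unknown
            "decimalLatitude": float(raw["decimalLatitude"]),
            "decimalLongitude": float(raw["decimalLongitude"]),
        }


def sample(store, layer):
    """Sampling: look up a layer value at each record's coordinates."""
    for rec in store.values():
        p = rec["processed"]
        rec["sampled"] = {"layer": layer(p["decimalLatitude"], p["decimalLongitude"])}


def index(store):
    """Indexing: flatten each record into one searchable document."""
    return [{"id": rid, **rec["processed"], **rec["sampled"]}
            for rid, rec in store.items()]


rows = [{"id": "r1", "basisOfRecord": " Specimen ",
         "decimalLatitude": "-35.28", "decimalLongitude": "149.13"}]
store = load(rows)
process(store, BASIS_VOCAB)
sample(store, lambda lat, lon: "south" if lat < 0 else "north")
docs = index(store)
print(docs)
```

Keeping the raw values alongside the processed ones means a stage can be rerun without reloading the source data.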

13 of 16

Hubs

  • Occurrence searching & mapping
  • Grails app
  • Plugin architecture
    • biocache-hubs = the plugin
    • generic-hub = the one to fork for your project
  • Internationalisation support

14 of 16

Apache Cassandra

  • Used for occurrence record storage
  • NoSQL database
  • Supports simple operations
    • Get by ID
    • Put by ID
    • Range query
  • Extensible storage
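
The three operations listed above can be illustrated with an in-memory stand-in. This sketch does not use Cassandra or any Cassandra driver; the `RowStore` class, the `dr1|001` style keys, and the column names are hypothetical. It only shows the access pattern: rows addressed by ID, kept in key order so that a contiguous key range can be read.

```python
import bisect


class RowStore:
    """In-memory stand-in for a key-ordered row store (not Cassandra itself)."""

    def __init__(self):
        self._keys = []  # row keys, kept sorted for range reads
        self._rows = {}  # row key -> {column name: value}

    def put(self, row_id, columns):
        """Put by ID: create the row if needed, then add or overwrite columns."""
        if row_id not in self._rows:
            bisect.insort(self._keys, row_id)
            self._rows[row_id] = {}
        self._rows[row_id].update(columns)

    def get(self, row_id):
        """Get by ID: return the row's columns, or None if the row is absent."""
        return self._rows.get(row_id)

    def range(self, start, end):
        """Range query: rows with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


store = RowStore()
store.put("dr1|001", {"scientificName": "Macropus rufus"})
store.put("dr2|001", {"scientificName": "Vombatus ursinus"})
store.put("dr1|002", {"scientificName": "Macropus giganteus"})
# A later put adds a column to an existing row without any schema change.
store.put("dr1|001", {"decimalLatitude": "-35.28"})

print(store.get("dr1|001"))
print([key for key, _ in store.range("dr1|", "dr1|~")])
```

Prefixing each key with a resource identifier, as assumed here, is what makes a range read return all records of one resource together.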

16 of 16

Solr & Lucene

  • Lucene index
    • name matching
    • generated from a Darwin Core Archive (DwC-A)
  • Solr
    • occurrence searching
    • mapping
    • live updates & offline index creation
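
Occurrence searching against Solr usually comes down to a request to a core's `/select` handler. The sketch below builds such a request URL using the standard Solr parameters `q`, `fq`, `rows` and `wt`. The core URL and the field names (`genus`, `state`) are assumptions for illustration and may not match the actual index schema.

```python
from urllib.parse import urlencode


def solr_select_url(core_url, query, filters=(), rows=10):
    """Build a Solr /select URL with a main query and optional filter queries."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    # Each filter query is sent as its own repeated fq parameter.
    params += [("fq", f) for f in filters]
    return f"{core_url}/select?{urlencode(params)}"


# Hypothetical core URL and field names.
url = solr_select_url(
    "http://localhost:8983/solr/biocache",
    "genus:Macropus",
    filters=["state:Queensland"],
    rows=5,
)
print(url)
```

Putting restrictions in `fq` rather than in `q` lets Solr cache each filter independently of the main query.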