1 of 12

GLAMpipe [alpha]

Introduction

2 of 12

Me

  • Ari Häyrinen
    • Information system expert
    • Jyväskylä University Library
    • ari.hayrinen@gmail.com

3 of 12

GLAMpipe introduction

  • Graphical data tool for GLAM materials
    • extract, transform, load (ETL)
  • Data processing with human intervention
    • differs from processing scientific data
  • Extendable via nodes
    • tiny bits of JavaScript
  • Self-documenting
    • process is visible and re-usable
  • API for developers

4 of 12

GLAMpipe nodes

  • GLAMpipe is a framework for nodes
  • All actions are wrapped in nodes
    • import, split, combine, replace, extract, download files, make lookups, etc.
  • A node does one thing
    • for example, splits a string
      • handy for CSV files where one cell holds multiple values
  • Nodes do not modify imported data
    • each node writes its output to a new field
    • easy to try things without messing up the data
  • New action = new node
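
A minimal sketch of the "one node, one new field" idea, in plain JavaScript. This is not GLAMpipe's actual node API: the function name `splitNode`, its options (`inField`, `outField`, `separator`), and the sample record are all hypothetical and only illustrate the principle.

```javascript
// Hypothetical node: splits one field into a list of values and stores the
// result in a NEW field. All names here are illustrative, not GLAMpipe's API.
function splitNode(record, { inField, outField, separator = ";" }) {
  const raw = record[inField] ?? "";
  const values = String(raw)
    .split(separator)
    .map((v) => v.trim())
    .filter((v) => v.length > 0);
  // Return a copy with the extra field; the imported field is left untouched.
  return { ...record, [outField]: values };
}

// Made-up sample record, standing in for one imported CSV row.
const imported = { title: "Skull", keywords: "bone; horse ; anatomy" };
const processed = splitNode(imported, {
  inField: "keywords",
  outField: "keywords_split",
});

console.log(processed.keywords_split); // [ 'bone', 'horse', 'anatomy' ]
console.log(processed.keywords);       // bone; horse ; anatomy
```

Because the original `keywords` value survives, a wrong separator can be corrected by running the node again rather than re-importing.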

5 of 12

GLAMpipe workflow

  • Import
    • CSV or web
  • Process
    • split, combine, replace, extract, download files, make lookups
  • Export
    • CSV or web
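
The three stages above can be sketched as three small functions. This is an illustration under assumptions, not how GLAMpipe is implemented: `importCsv`, `processRecords`, `exportCsv`, the field names, and the sample data are all made up, and the CSV parsing is deliberately naive (no quoted fields or embedded commas).

```javascript
// Import: parse a small CSV string into records (naive parser, see lead-in).
function importCsv(text) {
  const [header, ...rows] = text
    .trim()
    .split("\n")
    .map((line) => line.split(","));
  return rows.map((cells) =>
    Object.fromEntries(header.map((h, i) => [h, cells[i] ?? ""]))
  );
}

// Process: each step adds a new field and leaves existing fields alone.
const steps = [
  (r) => ({ ...r, authors_split: r.authors.split("|") }), // split
  (r) => ({ ...r, label: `${r.title} (${r.year})` }),     // combine
];

function processRecords(records) {
  return records.map((r) => steps.reduce((acc, step) => step(acc), r));
}

// Export: write the chosen fields back out as CSV.
function exportCsv(records, fields) {
  const lines = records.map((r) => fields.map((f) => r[f]).join(","));
  return [fields.join(","), ...lines].join("\n");
}

// Made-up input data.
const csvIn = "title,year,authors\nSample A,1990,X|Y\nSample B,2001,Z";
const records = processRecords(importCsv(csvIn));
const csvOut = exportCsv(records, ["title", "label"]);
console.log(csvOut);
```

Keeping the steps in an ordered list mirrors the self-documenting idea: the list itself is a readable record of what was done to the data.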

6 of 12

GLAMpipe installation

  • Can be used as a web service or local installation
    • local installation is needed for file uploads from a local directory
    • local installation requires some tech skills
  • Web service under development
    • will be usable during the WMF grant

7 of 12

Future

  • GLAMpipe is part of the repository tools at the JYU library
    • DSpace data maintenance, reference extraction, analytics
  • Development of the core is part of my job
  • Work in progress!
  • Things will settle down during the WMF grant
  • UI needs work and testing

8 of 12

GLAMpipe [alpha]

CASE

CASE: University of São Paulo Museum of Veterinary Anatomy (MAV)

9 of 12

Starting position

  • metadata in an Excel sheet
  • JPG images
  • sets of 30 images

10 of 12

Tasks

  • creating filenames with the full file path
  • extracting descriptions from the spreadsheet
  • creating wikitext
    • adding a specific license
    • adding categories
  • creating unique filenames
  • checking that files are not already in Commons!
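
A rough sketch of these tasks in JavaScript, under stated assumptions. The helper names (`fullPath`, `uniqueTitle`, `wikitext`, `duplicateCheckUrl`), the directory, the item id, the license template and the categories are placeholders, not the values used in the MAV case. The wikitext uses only the description field of the Commons `{{Information}}` template; a real upload would also need source, author and date. The duplicate check only builds a query URL for the MediaWiki API (`list=allimages` with `aisha1`); it does not send the request.

```javascript
// Full path from a directory and a bare filename.
function fullPath(dir, filename) {
  return dir.replace(/\/+$/, "") + "/" + filename;
}

// Unique upload title: description plus an item id, with characters that are
// not allowed in wiki page titles replaced by spaces.
function uniqueTitle(description, id, ext = "jpg") {
  const clean = description
    .replace(/[\/:#<>\[\]|{}]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  return `${clean} (${id}).${ext}`;
}

// Wikitext with a description, a license template and categories.
function wikitext({ description, license, categories }) {
  const cats = categories.map((c) => `[[Category:${c}]]`).join("\n");
  return [
    "{{Information",
    `|description=${description}`,
    "}}",
    `{{${license}}}`,
    cats,
  ].join("\n");
}

// Query URL that asks Commons for files with the same SHA-1 hash.
function duplicateCheckUrl(sha1) {
  const params = new URLSearchParams({
    action: "query",
    list: "allimages",
    aisha1: sha1,
    format: "json",
  });
  return `https://commons.wikimedia.org/w/api.php?${params}`;
}

// Placeholder values for illustration only.
console.log(fullPath("/data/images/", "img_01.jpg"));
console.log(uniqueTitle("Horse skull: lateral view", "item-001"));
console.log(
  wikitext({
    description: "Horse skull",
    license: "cc-by-sa-4.0",
    categories: ["Example category"],
  })
);
```

Checking by file hash rather than by filename catches files that were already uploaded under a different name.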

12 of 12

Result