1 of 35

Treasures Unlocked

Project Update for 2017 BHL Tech Team Meeting

Ariadne Rehbein

Missouri Botanical Garden

2 of 35

Project requirements

The Treasures Unlocked project would allow BHL to take the next step in image discovery – further development of the BHL portal user interface in order to allow both searching and browsing of images…

Deliverables:

  • Image users’ needs assessment
  • Recommendations for enhancements to BHL portal
  • Proof of concept demo for 1-2 recommendations for enhancements to BHL portal (dependent upon the skills of the Resident.)
  • Best practice guidelines on image discovery

3 of 35

BHL Strategic Plan

Goal 3: User Engagement: Increase global awareness about the BHL and biodiversity issues through outreach, learning and education, branding, and collaboration with existing and new user communities.

  • Increase Awareness about BHL and Grow our Audience
  • Keep Audiences Informed about Project Developments and Services
  • Foster Dialogue among Audiences about Biodiversity Topics and BHL Collections.

4 of 35

Demographics and Development

  • 2009: “Scientists make up the primary audience for the first release of the BHL. It is likely that audiences will be expanded through social networking tools and repurposing content for new audiences...The BHL was developed primarily for scientists in partnership with the Encyclopedia of Life, but the audience will broaden as more tools become available.” -- Gwinn and Rinaldo
  • 2012: “Reaching new audiences and demonstrating that BHL has much to offer outside the realm of taxonomy is a critical step in the growth of the project.” -- Costantino
  • 2015: “Audiences for this project fall into two primary groups - image content producers and image content consumers.” -- Rose-Sandler
  • 2017: To collect non-book visual resources, “we need to study their value to those studying biodiversity.” -- NDSR mentor/ resident discussion

5 of 35

Applying Frameworks

  • “It's your role as a requirements analyst to dig in and understand what the underlying need is, and ensure the solution is complete.”
  • “internally framing our projects as experiments rather than as complete systems… We will also learn about how to best build these projects.”

Sources: Angela Wick, “Requirements Elicitation for Business Analysis: Interviews” on Lynda.com; Gibson, Sherer, and Gibson, “How to do systems analysis” textbook; Chris Lintott, Zooniverse, CrowdCon 2015 proceedings p. 41, http://www.crowdconsortium.org/wp-content/uploads/crowdconP.pdf

6 of 35

Investigation

7 of 35

UX Research Dimensions

Sources: Janelle Estes, “Intro to UX: Conducting Smart User Research” on Skillshare; Prof. Kentaro Toyama, “Understanding User Needs” on EdX

8 of 35

Source: Prof. Mark Newman, “Introduction to User Experience” on EdX

9 of 35

Source: Prof. Mark Newman, “Introduction to User Experience” on EdX

10 of 35

Source: Prof. Mark Newman, “Introduction to User Experience” on EdX

11 of 35

Taxonomists and Historians of Science

12 of 35

Methodology

Customer research framework. Source: Doug Winnie, “Product Management Foundations” on Lynda.com

Internal data (to BHL or MoBot)

  • Quantitative:
    • User stories blog post analysis (counts)
    • Pam’s survey
    • 2010 BHL survey
  • Qualitative:
    • User stories blog post analysis (themes)
    • MoBot Research Staff interviews

External data

  • Quantitative:
    • No outside source quantifies illustration use among taxonomists
  • Qualitative:
    • Nomenclatural publishing requirements regarding illustrations
    • “Descriptive taxonomy: The Foundation of Biodiversity Research” eds. Watson, Lyal, Pendry
    • Examples of specimen and observation data management in relationship to citizen science

13 of 35

Previous studies

  • Kelli’s blog post analysis: 16% downloaded images (vs. PDFs)
  • 2010 survey: Finding illustrations

14 of 35

Guiding Questions

Kelli Trei

  • A deeper analysis of product creation and support using BHL data
  • “What impact, if any, does layperson use of the BHL have on science?”
  • Are other populations I’m focusing on using BHL illustrations? What can we find out about this?
  • Why are they downloading them?

2010 survey

  • Who is considered a general interest reader (those who have significantly more interest in illustrations)?
  • Is there something to the plate issue?
  • Why are they finding them?

15 of 35

Quantitative research

  • Blog post analysis: mention illustrations
    • 36% Scientist (21/55)
    • 30% Historian (3/10)
    • 80% Amateur Scientist (4/5)
  • Pam’s survey: Do you use illustrations/images in BHL?
    • 59% Scientist / Taxonomist (165/278)
    • 55% Historian (16/29)
    • 49% Citizen Scientist / Hobbyist (25/51)
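
The survey percentages above follow directly from the counts in parentheses. A minimal Python sketch of that arithmetic is given here; the counts are copied from the slide, while the names `survey_counts` and `percent_share` are illustrative and not part of the project.

```python
# Counts copied from the survey slide: (respondents who use illustrations, total respondents).
survey_counts = {
    "Scientist / Taxonomist": (165, 278),
    "Historian": (16, 29),
    "Citizen Scientist / Hobbyist": (25, 51),
}


def percent_share(yes: int, total: int) -> int:
    """Return yes/total as a whole-number percentage."""
    return round(100 * yes / total)


# Compute the share for each respondent group.
shares = {group: percent_share(*counts) for group, counts in survey_counts.items()}

for group, share in shares.items():
    yes, total = survey_counts[group]
    print(f"{share}% {group} ({yes}/{total})")
```

Keeping the raw counts next to each percentage, as the slide does, makes it easy to see how small some groups are (for example, 29 historians).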

16 of 35

Taxonomists

  • 10: the emotional value provided by illustrations
  • 5: missing plate numbers and issues finding a high-resolution download option
  • 5: “for reference” (some related to publication)
  • 1: fundraising
  • 1: updating a historic species description

17 of 35

MoBot interviews

  • Garden-wide email, Research listserv email, organizational contacts
  • 5 interviewees
  • Illustrations of type specimens required for IDing specimens
  • Citation or inclusion of published photos or illustrations for publications
  • Related illustration can help with unclear protologue
  • Help understand historic conceptualizations (species names and places)
  • Assignment of lectotype in lieu of physical specimen as type

18 of 35

Historians of Science

  • 3/10 posts
  • Focused on sharing illustrations, even if not officially part of their work (exhibitions, “thinking about how people access our specimens and content,” personal Twitter and blog)
  • Refer to illustrations for the history of specimen collections/curators
  • Like taxonomists, drawn to illustrations in their area of work

19 of 35

Citizen Scientists

4/5 posts

  • Taxonomic expertise, passion
  • Sharing illustrations through their own work is a priority: resharing illustrations with context on a blog and through Wikipedia.
  • The 2 whose focus in citizen science isn’t sharing illustrations note that illustrations are valued for their beauty and scientific value.

The major focus of citizen science is observation and specimen labeling, but centered on photos

20 of 35

Product creation

  1. Overcoming difficulties inherent to the literature or to the metadata once online
  2. Checklists/ Resources created by taxonomists with dual purpose of conservation
  3. References for species that link back to BHL literature (and have varying rigor through reference to specimens, taxonomic trees, authors)
  4. Blogs and campaigns

21 of 35

Crowdsourcing Analysis

22 of 35

Methodology

  • Invited most prolific participants on BHL Flickr (top 18 with over 450 tags), 2 Science Gossip moderators, and recommended participants (6)
  • Moved on to the most recent participants (10 additional invitations sent)
    • 3 Science Gossip interviewees
    • 4 BHL Flickr interviewees
    • Siobhan, Michelle, Geoff Belknap

23 of 35

Interview protocol

  1. Understand the reasons for your involvement in BHL illustration crowdsourcing in relationship to any involvement in crowdsourcing/ citizen science overall
  2. Evaluate successes and challenges of Science Gossip according to best practices
  3. Illustration use and sharing
  4. Future BHL illustration crowdsourcing

Informed by sources such as “Crowdsourcing Our Cultural Heritage” ed. Mia Ridge, “Citizen Science: Theory and Practice” online journal, Donelle McKinley’s evaluation principles for crowdsourcing cultural heritage

24 of 35

Key Findings

  1. Loss of highly engaged participants over time/ a handful of active participants now (both)
  2. Lack of community (BHL Flickr); Loss of community and lack of progress reports (Science Gossip)
  3. Range of relationships to BHL (both)
  4. Dedication to “the democratization of information” and willingness to overcome nitpicky data production (BHL Flickr)
  5. Specialized taxonomic skills (some BHL Flickr)
  6. Frustration with the lack of demonstrated results (stated goals of indexing/access) and support for automated description (key BHL Flickr volunteers)
  7. Motivation to pursue personal or group interests beyond tagging for ingestion, especially using Wikipedia (both)

25 of 35

Number of illustrations

Illustrations tagged:

Number of tags:

Illustrations tagged: 31,771

Number of tags: 134,176

26%

18 journals completed

200,000, or 4% of 5 million
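
As a quick arithmetic check on the last figure, 200,000 out of 5 million can be expressed both as a fraction and as a percentage. This minimal Python sketch only restates the two numbers from the slide; the variable names are illustrative.

```python
part = 200_000        # figure from the slide
whole = 5_000_000     # "5 million" from the slide

fraction = part / whole      # share as a decimal fraction
percent = fraction * 100     # the same share as a percentage

print(f"fraction: {fraction:.2f}, percent: {percent:.0f}%")
```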

26 of 35

Flickr Tag stats

U: 6,866 U: 8,682 U: 1,300 U: 1,300 U: 60

T: 56,410 T: 16,362 T: 3,255 T: 14,592 T: 1,095 T: 42,000

27 of 35

Computer vision

  • Lepidoptera app, Google Arts + Culture Experiments, German trees herbarium specimens test case
  • Clarifai
    • Similar to other computer vision, but customizable (CogApp Presentation at IIIF conference)
    • 5M general model predicts and 5M custom predicts (up to 20 free concepts)
    • Open to our trying anything if willing to write a story for their blog

28 of 35

Special Thanks and ongoing advising

  • Michelle Marshall, Siobhan Leachman, Grace Costantino
  • Interviewees
  • CoderGirl UX course

29 of 35

CoderGirl UX course project

30 of 35

Preliminary Recommendations

31 of 35

Goals

  • Vision: Empowering all to connect to biodiversity through historic scientific art
  • Framework: Exploration & integrating lessons quickly
  • Connect love of illustrations among scientists with that of the public (further understanding of biodiversity, support for science)
  • Fulfill desire to make information accessible according to personal or group interests (within constraints of limited management)
  • Pursue systematic data production as new technologies emerge
  • Achieve our existing crowdsourcing projects’ promise to provide better access to the metadata created

32 of 35

Brainstorming

33 of 35

Strongest ideas

  • Wikipedia community/ Wikipedian in Residence combined with computer vision
  • Continue to build campaigns with GLAMs and collaborate with Scientists and Historians of Science to share information about their work, species, and history of the field using illustrations as a touchpoint
  • Limited term/goal-based tagging (ex. fitting into campaigns for immediate results)
  • Access to tagged illustrations using IIIF
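
To make the IIIF idea concrete, here is a minimal Python sketch of how an image request URL is assembled under the IIIF Image API pattern `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The server address and the identifier below are hypothetical placeholders, not real BHL endpoints, and the `size="max"` default assumes Image API 2.1 or later.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL from its path parameters."""
    # The identifier must be percent-encoded so that any "/" it contains
    # is not read as a path separator.
    ident = quote(identifier, safe="")
    return f"{base_url.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, for illustration only.
full_image = iiif_image_url("https://iiif.example.org/image", "bhl-page-12345")

# Same image, scaled to fit within 400x400 pixels (a thumbnail for browsing).
thumbnail = iiif_image_url("https://iiif.example.org/image", "bhl-page-12345",
                           size="!400,400")

print(full_image)
print(thumbnail)
```

Because region and size are part of the URL, a browse interface could request thumbnails or cropped illustration regions without storing separate derivative files.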

34 of 35

Continuing focus

Focus

Goal

Steps to get there

Enable Taxonomists’ search/browse in BHL portal

  • Validated interface design for taxonomists
  • Unsure about the data production method for scientific names, because this requires specialized human knowledge; small sections of work/ compelling cases would be required; can also pursue conversations/ research on computer vision for this
  • Complete analysis of Pam’s other illustration survey questions
  • At MoBot:
    • Interviews to gain further context about main goals
    • Speak-aloud observations of task completion
    • Iterative prototyping to improve process (paper mockups including interaction stages)

Enabling Computer vision and Wikipedia

  • Basic processes and options for management of BHL and Wikipedia connection
  • A sense for community adoption
  • Test set of automatically ID’ed illustrations available on Flickr
  • Pending this, a determination of best platform for full set
  • Advising: Siobhan, Michelle, Wikipedian in Residence -- potential outreach and interviews with Wikipedia community
  • Clarifai API setup
  • Hosting platform setup and upload support

Image users’ needs assessment, Recommendations for enhancements to BHL portal, Proof of concept demo according to skills of Resident

35 of 35

Thank You!

Questions?