1 of 21

National Agricultural Library (NAL) Scientific Data Services

National Agricultural Producers Data Co-operative

August 22, 2022

Peter Arbuckle

Scientific Data Management Branch

National Agricultural Library, Agricultural Research Service, 

United States Department of Agriculture

National Agricultural Library

United States Department of Agriculture

2 of 21

Agenda

  1. Who/What is NAL…
  2. NAL Scientific Data Mission
  3. Scientific Data Services and Products

3 of 21

National Agricultural Library

USDA established in 1862…

“to acquire and to diffuse among the people of the United States useful information on subjects connected with agriculture and rural development”

4 of 21

NAL Scientific Research Data Services

  • NAL supports scientific research throughout the process
    • Reference, access to literature, Evidence Synthesis/Systematic Review
    • Description and cataloging  
    • Digital Preservation (Coming soon)
    • Meta-analysis (OD)
  • Data Services: Data management planning, data description, and access (with underpinning technical workflows)

5 of 21

NAL Scientific Data Mission

• NAL scientific data services help USDA-funded research communities make their data “Findable, Accessible, Interoperable, and Reusable (FAIR).” We do this by helping create rich, well-structured, machine-readable metadata and by providing channels to publish data and metadata and infrastructure for data preservation.

• In other words: we provide library services such as metadata support, repository access, and catalog functions for data products from specific research communities.

6 of 21

FAIR data principles

Findable

    • Rich metadata
    • Persistent identifiers

Accessible

    • Fixity
    • Data & metadata available to target audience

Interoperable

    • Open formats
    • Common metadata standards
    • Controlled vocabularies

Reusable

    • Usage license
    • Provenance
    • Community standards

FAIR Principles

https://www.force11.org/group/fairgroup/fairprinciples
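
The FAIR bullets above can be read as a checklist for a dataset's metadata record. The sketch below is a minimal illustration in Python, not an NAL tool: the field names, the sample record, and the placeholder identifier are all assumptions made up for this example. It reports which checklist fields are missing and runs a fixity check by comparing a stored SHA-256 checksum with one recomputed from the file's bytes.

```python
import hashlib

# Hypothetical checklist fields, loosely mapped to the FAIR bullets above.
# These names are illustrative only; they are not an NAL or Ag Data Commons schema.
CHECKLIST_FIELDS = [
    "title",            # Findable: rich metadata
    "identifier",       # Findable: persistent identifier
    "checksum_sha256",  # Accessible: fixity
    "keywords",         # Interoperable: controlled vocabulary terms
    "license",          # Reusable: usage license
    "provenance",       # Reusable: provenance
]


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def missing_fields(record: dict) -> list:
    """List the checklist fields that are absent or empty in the record."""
    return [name for name in CHECKLIST_FIELDS if not record.get(name)]


def fixity_ok(record: dict, data: bytes) -> bool:
    """True if the stored checksum matches the checksum of the data."""
    return record.get("checksum_sha256") == sha256_hex(data)


# Made-up sample data file and record for demonstration.
sample_data = b"plot,yield\n1,4.2\n2,3.9\n"
sample_record = {
    "title": "Example field trial yields",
    "identifier": "doi:10.0000/placeholder",  # placeholder, not a real DOI
    "checksum_sha256": sha256_hex(sample_data),
    "keywords": ["crop yield"],
    "license": "CC0-1.0",
    "provenance": "Collected by a hypothetical research unit",
}

print(missing_fields(sample_record))
print(fixity_ok(sample_record, sample_data))
```

A changed file no longer matches its stored checksum, which is the point of a fixity check: `fixity_ok(sample_record, sample_data + b"x")` returns `False`.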

7 of 21

Why FAIR? The Research Data Lifecycle

Research data is an investment, one that yields a greater return if it can be reused in future research.

Making data reusable requires specialized knowledge of the subject matter as well as expertise in modern library science.

Diagram from Rüegg et al. 2014 in Front Ecol Environ doi:10.1890/120375

8 of 21

What we do: Scientific Data Services

Provide services at various stages of the scientific data life cycle

  • Plan. What data will be collected, how it will be collected, and how it will be maintained.
    • Data Management Web Resources (e.g. for selecting data repositories)
    • Data Management Plan Review
    • Preparation according to Best Management Practices (metadata standards)
  • Describe. Data must have documentation to be usable
    • Metadata and curation services - work with a variety of established and emerging metadata standards
    • Interconnectivity through metadata elements – PIDs, terms, etc.
  • Publish and Preserve. Access to the data at multiple timescales
    • DOIs
    • Repository service through ADC and resources on selecting subject-specific repositories
    • Data preservation
  • Discover. Tools and workflows to find and discover data
    • Multiple paths to published data through the user interfaces of a variety of systems
    • Connection of data to literature
  • Integrate. Bringing different data products to bear on research questions
    • Develop workflows and tools for LCA data integration
    • Work with the community to identify key parameters for integration and reuse
    • Build capacity for automated data integration

Focus on specific customers and user communities.

9 of 21

Data Services: Challenges and Opportunities

Library science and services for data are new to many scientific communities

    • Standards often don’t consider new media / information types
    • Consensus and Best Practices are still in development
    • Technology and how it's applied is evolving rapidly

NAL works with research groups to share best management practices and solve informatics and access challenges.

A library’s role in this part of research is continuously evolving. 

10 of 21

How we do it: close collaboration with customers

NAL experts are embedded in specific scientific communities and lead or participate in working groups and communities of practice to:

    • Understand needs and challenges of our customers
    • Develop customer-driven requirements for tools and services we provide  
    • Raise awareness of recommended data stewardship practices and of existing tools and services that support data management, in the context of customer objectives
    • Provide policy input
    • Establish relationships of trust and foster new collaborations

“Get closer than ever to your customers. So close that you tell them what they need well before they realize it themselves.”

Steve Jobs

11 of 21

Customer-focused, use-case-based services

  • Primary customer groups are: 
    • USDA-supported ag researchers and their collaborators interested in accessing or providing access to agricultural research data, and the data managers working with them
    • ARS and USDA administrators involved in developing and implementing data governance and data management policies

12 of 21

USDA Departmental Regulation 1020-006: Public Access to Scholarly Publications and Digital Scientific Research Data

Scope: All unclassified scientific research (intramural and extramural) supported wholly or in part by the USDA, regardless of the USDA funding level or funding mechanism

Scholarly Publications must be made publicly accessible via PubAg

Digital Scientific Research Data:

Data Management Plans must accompany all scientific research

Public Access to research data is required* through publication in an appropriate repository and a metadata catalog entry in the Ag Data Commons

(*Excludes data containing PII or other types of sensitive data)

Persistent Identifiers for Researchers (e.g., ORCID iD) must be used by authors of scholarly publications and digital scientific research data assets.
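
An ORCID iD is a 16-character identifier whose last character is a checksum, so a mistyped iD can often be caught before it enters a metadata record. The sketch below is a minimal illustration in Python and is not part of the regulation or any NAL system; the function names are made up for this example. It uses the ISO 7064 MOD 11-2 check character scheme that ORCID documents for its identifiers, and it only checks that an iD is well formed, not that it is registered to a person.

```python
import re

# Four hyphen-separated groups; the final character may be a digit or "X".
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is a well-formed ORCID iD with a correct checksum."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_char(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
# Changing the last digit breaks the checksum.
print(is_valid_orcid("0000-0002-1825-0098"))
```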

13 of 21

PubAg is… 

  • A search and discovery interface to peer-reviewed journal articles in the agricultural sciences
  • The USDA public access archive system hosted by the National Agricultural Library
  • Available and free to search online: https://pubag.nal.usda.gov

14 of 21

Ag Data Commons services

Catalog services support:

  • Compliance with federal Open Data policy through a feed into data.gov

  • Cross-domain discovery of agricultural research data through harmonized metadata

Repository services provide:

  • Public access, storage, and preservation for long-tail data (data without a subject-specific repository)
  • DOI minting for deposited datasets, making them accessible and citable

Value-added services include:

  • Expert data curation service and data management consultations
  • Development of microservices for ingest
  • Collaboration with the customer community on adding tools and workflows that facilitate data analysis and integration

https://data.nal.usda.gov/

[Diagram: data life cycle stages: Plan, Describe, Publish & Preserve, Discover, Integrate]

15 of 21

Ag Data Commons dataset

[Annotated screenshot of a dataset record, with these elements labeled:]

  • Author ORCID
  • Source data
  • Methods article
  • Collection / parent dataset
  • DOI
  • Suggested dataset citation
  • Published articles & manuscripts
  • Related datasets and previous versions
  • Descriptive metadata
  • Data files
  • Controlled vocabularies

16 of 21

The i5k Workspace@NAL (https://i5k.nal.usda.gov)

  • Provides access to 83 arthropod genome projects and counting;
  • Improves fundamental research data via community-driven gene curation of over 15,000 gene models;
  • Integrates data with primary repositories for sequence or other data;
  • Provides webinars, tutorials, and training for the i5k community

17 of 21

i5k Workspace@NAL customers

  • i5k = 5,000 insect genomes
  • Scientists from across the world, in academia and government, working on insect genomes
  • ~75 ARS scientific customers, including Ag100Pest

18 of 21

Life Cycle Assessment Commons

  • Provide access to LCA research products through an open repository
  • Develop repository and software workflows for discovery, access, and integration
  • Develop and maintain metadata guidance and best practices
  • Convene and lead a community of practice around LCA data modeling and interoperability
  • Provide education and outreach on LCA data best practices and advance LCA data interoperability and integration

19 of 21

Website and Repository: https://www.lcacommons.gov/

Focus on institutional customers: providers and distributors

20 of 21

Geospatial Data Services 

  • Supporting the USDA Enterprise Geospatial Management Office in implementing the metadata requirements of the Geospatial Data Act of 2018 (GDA)
    • Subject matter expert (SME) services for ISO 19115 metadata, the new standard
    • Leading USDA Geospatial Metadata Working Group

  • Developing services to help USDA agencies efficiently create effective geospatial metadata as required by the GDA

21 of 21

Summary

NAL provides services to make scientific research data FAIR

What makes NAL unique:

  • Working closely with research communities and with evolving standards and BMPs to make data FAIR.
  • Helping develop the informatics foundation for applying advanced technology.
