1 of 17

Engaging with the entire research data management lifecycle

Wind Cowles, PhD

she/hers

Director of Data, Research, and Teaching Services

windcowles@princeton.edu

Esmé Cowles

he/his

Assistant Director for Library IT

escowles@princeton.edu

Samvera Connect 2022

2 of 17

Addressing the Research Data Lifecycle

Princeton Research Data Service

Partnership between

  • University Library
  • Office of the Dean for Research
  • Office of Information Technology

Research data lifecycle diagram: Plan, Acquire, Process, Analyze, Preserve, Share Results, Discover & Re-use

3 of 17

Addressing the Research Data Lifecycle

Princeton Research Data Service

  • Data Management Planning
  • Data Wrangling & Organizing
  • Data Curation & Publishing
  • Institutional data repository
  • DMPTool
  • Large data

Research data lifecycle diagram: Plan, Acquire, Process, Analyze, Preserve, Share Results, Discover & Re-use

4 of 17

Data Sharing is Hard*

“We will share our data by publishing it in a repository”

*If you wait until the end of the project to organize and document the data

Research data lifecycle diagram: Plan, Acquire, Process, Analyze, Preserve, Share Results, Discover & Re-use

?!?!

5 of 17

Office of Information Technology

Research Computing/PICSciE

Department

Instruments

Faculty

Faculty Bookshelf

NAS

Enterprise

NAS

MOL BIO

NAS

PNI

Cluster

Scratch

Backups

Commercial Cloud

Library - Princeton Research Data Service

DataSpace

/tigress

/projects

6 of 17

Aligning needs into a service

Steadily increasing demand on storage for research

Distributed storage management

Need to ensure the long-term value of research data

FAIR principles

Good stewardship of limited resources

How to translate need into service

Institutional approach

Scalability

“We’re running out of space on the clusters… again.”

“How can I access my student’s data now that she’s left?”

“My funder says I need to make my data available.”

“I worked on some data a few years ago that I need now.”

“I want to share some of my data with my collaborators.”

7 of 17

TigerData is a comprehensive set of data storage and management tools and services that provides storage capacity, reliability, functionality, and performance to meet the needs of a rapidly changing research landscape and to enable new opportunities for leveraging the power of institutional data. It will include data management tools and services in concert with a community of practice that enables researchers and data managers to describe and tag their research materials, to seamlessly move them between different storage architectures, and to easily find their data and code for re-use and collaboration. The resulting ecosystem will facilitate research and contribute to the stewardship of knowledge to enable reusability and preservation of data and code produced by Princeton University.

8 of 17

Local Data

Working Data

Archival Data

Presentation Layer

Other

Object

POSIX

Local

Storage

Open Repository

Other

Instrument

Cluster

Data Curation

  • Training & Consultations
  • Community of Practice
  • Designated Data Managers
  • Metadata governance
  • Core metadata collected at project start

Data Management Tools

Data Management Services

9 of 17

Research data lifecycle diagram: Plan, Acquire, Process, Analyze, Preserve, Share Results, Discover & Re-use

10 of 17

Why is Library IT involved in this project?

11 of 17

What's the need?

  • Lots of integrations
    • Auth/role management
    • ServiceNow, grant management
    • Globus, ORCID
    • Princeton Data Commons
  • Presentation layers have limited integrations and poor usability
  • Need to make this a more seamless experience

12 of 17

What can we offer?

  • Experience building apps with metadata forms and workflows
    • Princeton Data Commons
    • Figgy, Blacklight apps
  • Metadata expertise
    • DataCite, MARC, EAD, DC, etc.
    • IIIF, METS, PCDM
  • Strong commitment to standards, sustainability, and open source

13 of 17

What's the alternative?

  • Datasets will continue to come to the research data repository
  • We'll continue to ingest content to support faculty projects
  • We need better tools to manage these processes

14 of 17

Possibilities

  • Just solving some of the current pain points will be a big win
  • Improving metadata and streamlining dataset ingest
  • Experience with digital collections tells me this could be transformative

15 of 17

Technical Working Group, Steering Committee Members, and Project Sponsors

Stephanie Ayers, PUL

Lori Bougher, DDSSi

Matt Chandler, PUL/PRDS

Dan Chin, OIT

Esmé Cowles, PUL

Wind Cowles, PUL

Pablo Debenedetti, DFR

Jay Dominick, OIT

Natasha Ermolaev, CDH

Edward Freeland, DDSSi

Curt Hillegas, OIT/PICSciE

Beth Holtz, OIT

Anne Jarvis, PUL

Rishi Joshi, RC

Scott Karlin, CS

Rebecca Koeser, CDH

Carol Kondrach, OIT

Irene Kopaliani, RC

Robert Knight, RC

Kate Lynch, PUL

Keith Martin, OIT

Chris Miller, CS

Josko Plazonic, RC

Bess Sadler, PUL

Brian Seiler, RC

Chris Tengi, RC

Randee Tengi, PNI

Bill Wichser, PICSciE

John Wiggins, PNI

Carol Williams, OIT

16 of 17

PUL IT Team

Carolyn Cole

Stephanie Ayers

Hector Correa

Alicia Cozine

James Griffin

Vickie Karasic

Kate Lynch

Francis Kayiwa

Chuck McCallum

John Kazmierski

Bess Sadler

Robert-Anthony Lee-Faison

Philippe Menos

17 of 17

Thank You!

Wind Cowles, PhD

she/hers

Director of Data, Research, and Teaching Services

windcowles@princeton.edu

Esmé Cowles

he/his

Assistant Director for Library IT

escowles@princeton.edu