RCCI - Research Storage Services Document
UTech Research Computing and Cyberinfrastructure

Research Storage Services Document

Version 1.4

1.0 Purpose and Scope

The purpose of this document is to describe University Technology (UTech) Research Computing and Cyberinfrastructure (RCCI) storage services, fees, and operating attributes. This document clarifies the responsibilities and procedures of RCCI and the Customer to ensure Customer needs are met in a timely manner. This document does not address high performance (“Panasas”) storage that is operated as part of the High Performance Computing (HPC) service.

2.0 Description of Storage Services

2.1 Research Storage Subscription Service

The standard research storage service provides easy, reliable, and immediate access to research data from on-campus offices (as a mapped drive in Windows or an NFS-mounted filesystem in Linux or macOS) as well as from the HPC cluster systems. All data is replicated between the two CWRU data centers to protect against a data center disaster. This service is currently based on commercial product offerings from Qumulo and has the following characteristics:

Research Storage Attributes

Storage accessible through Globus, SSH/SFTP, NFS, and SMB

All /mnt/rstor volumes are mounted on HPC compute nodes

Appropriate for large and small project-related data processing that exceeds the /home and /scratch data quotas

Protected using Qumulo erasure coding technology and able to withstand multiple drive failures

Protection against accidental file deletion within a 7-day period

Protection against catastrophic failure of the primary storage system by replicating data to a separate system

Payment for storage is upfront and annual using a CWRU Speedtype

Research Storage is provided with the following service option:

Service Option: Dual Copy

Description: The default option, appropriate for critical data. Data is stored on an enterprise-class system capable of withstanding multiple drive failures. Data is replicated nightly to a secondary system. Snapshots are included for file recovery purposes within a seven-day window. UTech recommends a third copy of data if it is determined to be non-reproducible.

Service Cost: $180 per TB of storage allocated per year

2.2 Research Archive Storage Subscription Service

Research Archive Storage gives the CWRU research community an option for storing data that has long-term value but is infrequently accessed. The system archives data generated from research activities and allows for convenient movement of large and diverse datasets into a UTech-managed system. The Research Archive Storage system is based on storage services provided by the Ohio Supercomputer Center. The service provides the following storage attributes:

Research Archive Storage Attributes

Available to faculty members of Case Western Reserve University for research purposes.

Accessible through the Globus data management platform.

Subscriber is provided with an archival directory for upload and retrieval of data.

All files in the archive are backed up daily, with a single copy written to tape.

Deleted or overwritten data is retained for 90 days. Up to fourteen versions of an overwritten file are stored; additional versions are deleted in chronological order.

Data is stored in an archival directory for five years. After five years, a subscriber must renew their subscription. If not renewed, data will be kept on disk for one week, then archived for one month and then permanently deleted.

Increases in storage allocation must be ordered in additional 5 TB increments for five years.

Existing allocations and new allocations will be co-termed to the directory expiration date. For example, a subscriber whose existing allocation has two years remaining will not need to pay for the extension until those two years have passed. If the subscriber adds an allocation now, it will be co-termed with the remaining two years of service.

The following characters are not allowed in Research Archive object names for reasons of cross-platform compatibility: control characters such as carriage return (CR) and line feed (LF), double quotation mark ("), asterisk (*), question mark (?), less than sign (<), greater than sign (>), backslash (\), and vertical line (|).

Faculty members can store up to 25 TB of data at the subsidized rate. Storage above 25 TB is charged at the non-subsidized rate.

Faculty who use the previous-generation tape-based archive are grandfathered at the previous storage rates for five years. After July 2026, the newly established rates apply.

There is no provision for a refund for this service.

This service is heavily subsidized by UTech to aid in data reproducibility and fulfilling contractual requirements. UTech retains the right to cap utilization of this service at any time.
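The naming restriction in the attributes above can be checked before an upload. The following is a minimal sketch, not an RCCI-provided tool: the function names are hypothetical, and it assumes that "control characters" means every character in the Unicode `Cc` category and that the double quotation mark meant is the straight ASCII one.

```python
import unicodedata

# Characters listed as disallowed in Research Archive object names,
# in addition to control characters. Assumption: the straight ASCII
# double quote is the one intended.
FORBIDDEN_CHARS = set('"*?<>\\|')


def invalid_characters(name: str) -> list[str]:
    """Return the disallowed characters in name, in order of first appearance."""
    found = []
    for ch in name:
        # Assumption: "control characters" covers the whole Unicode Cc
        # category, which includes CR, LF, and tab.
        is_control = unicodedata.category(ch) == "Cc"
        if (is_control or ch in FORBIDDEN_CHARS) and ch not in found:
            found.append(ch)
    return found


def is_valid_archive_name(name: str) -> bool:
    """True when name contains none of the disallowed characters."""
    return not invalid_characters(name)
```

Running `invalid_characters` over a list of file names before starting a transfer shows which names need to be changed and which characters are responsible.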

Research Archive Storage is provided in multiple service options:

Service Option: Active Archive

Description: Data is stored in an archival directory for five years.

Subsidized Rate: $200 per 5 TB for five years of capacity. Rate subsidized up to 25 TB.

Non-Subsidized Rate: $960 per 5 TB for five years of capacity above 25 TB per faculty member.
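As a worked illustration of the two rates, the sketch below estimates the five-year cost of an archive allocation. It is not an official calculator. The function name is hypothetical, and it assumes that requests are rounded up to the next 5 TB increment and that the 25 TB subsidized limit applies to a faculty member's total allocation.

```python
import math

INCREMENT_TB = 5            # allocations are sold in 5 TB increments
SUBSIDIZED_CAP_TB = 25      # subsidized rate applies up to 25 TB
SUBSIDIZED_RATE = 200       # dollars per 5 TB for five years
NON_SUBSIDIZED_RATE = 960   # dollars per 5 TB for five years


def archive_cost(requested_tb: float) -> int:
    """Estimated five-year cost in dollars for a total allocation of requested_tb."""
    if requested_tb <= 0:
        raise ValueError("requested_tb must be positive")
    # Assumption: a request is rounded up to a whole number of 5 TB increments.
    blocks = math.ceil(requested_tb / INCREMENT_TB)
    subsidized_blocks = min(blocks, SUBSIDIZED_CAP_TB // INCREMENT_TB)
    extra_blocks = blocks - subsidized_blocks
    return subsidized_blocks * SUBSIDIZED_RATE + extra_blocks * NON_SUBSIDIZED_RATE
```

Under these assumptions, 25 TB costs 5 x $200 = $1,000 for five years, and 30 TB costs $1,000 plus one non-subsidized increment at $960, for a total of $1,960.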

2.3 Research Dedicated Storage Service

Research groups may purchase a dedicated storage server through UTech. This option provides dedicated hardware that fits researchers’ needs and can be used as a dedicated storage system.

To keep costs low, this service uses commodity hardware that does not include high availability features. The target service availability is 98.0% uptime. Unlike our other storage offerings, occasional scheduled or unscheduled outages may occur and will impact availability.

Research Dedicated Storage Attributes

Acquisition of equipment must be CWRU faculty sponsored

All /mnt/rds volumes are mounted on HPC head nodes

Faculty sponsor is responsible for:

  • managing resource allocations including membership and quotas
  • proper classification of data and safeguards per University policy

Storage is accessible through:

  • Common data transfer protocols (Globus, SSH/SFTP) via HPC login nodes or other authorized KSL data center hosted servers.

Storage is not expandable beyond the initial system configuration.

(Additional capacity can be acquired through the purchase of an additional storage system and should be treated as separate from any prior storage systems.)

Appropriate for large data storage that is not heavily I/O dependent

Data is protected using ZFS RAID-Z2 or higher and is able to withstand multiple drive failures

Because of the ZFS design, used storage capacity must not exceed 80% of the stated usable file system capacity. The remaining 20% is reserved for snapshots and other file system operations and must be maintained.

Protection against accidental file deletion within a defined number of days using snapshot technology

Protection against catastrophic failure of the primary storage system is not a default option. Please contact UTech for more information.

Backup and recovery services are available for additional cost.

Payment for storage is upfront for hardware acquisition and maintenance support.

The system uses commodity hardware that does not include high availability features. The target service availability is 98.0% uptime but occasional scheduled or unscheduled outages may occur.

UTech manages the servers including patching and other administrative tasks.

A maximum of two hours per week of UTech staff time will be provided for administrative tasks for this service. Tasks that take longer are subject to service fees.

To control service costs, UTech provides a standard set of storage systems based on commodity hardware for this service. Please contact RCCI for more information.
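The 80% capacity limit in the attributes above affects how a dedicated system should be sized. The sketch below shows the arithmetic only. The function names are hypothetical, and actual sizing should be confirmed with RCCI.

```python
USABLE_PERCENT = 80  # used capacity must stay at or below 80% of the stated usable capacity


def max_data_tb(stated_usable_tb: float) -> float:
    """Largest amount of data (TB) that fits within the 80% limit."""
    return stated_usable_tb * USABLE_PERCENT / 100


def headroom_tb(stated_usable_tb: float, used_tb: float) -> float:
    """TB that can still be written before reaching the limit; negative when over it."""
    return max_data_tb(stated_usable_tb) - used_tb
```

For example, a system with 100 TB of stated usable capacity should hold no more than 80 TB of data, so a group that expects to store 80 TB needs at least 100 TB of stated usable capacity.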

Research Dedicated Storage is provided with the following service option:

Service Option: Dedicated Storage

Description: Appropriate for storage of large data files that do not require high availability or intensive I/O. Data is stored on a commodity-class system that can withstand multiple drive failures. Snapshots are included for file recovery purposes within a seven-day window. This hardware does not include high availability features. The target service availability is 98.0% uptime but occasional scheduled or unscheduled outages may occur.

Service Cost: Please contact UTech for more information. Costs vary with projected capacity and desired resiliency.

3.0 Service Objectives

Unless noted in Section 2.0, the goal is 99% availability during the defined service hours, excluding scheduled maintenance downtime or disasters outside the control of RCCI.
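To give a sense of scale, an availability percentage can be converted into the downtime it allows. The sketch below is an illustration only. The function name is hypothetical, and because this document does not define the service hours, it assumes a measurement period of a full 365-day year around the clock.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumption: full-year, 24x7 measurement period


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime that still meet the given availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_hours * (100 - availability_percent) / 100
```

Under that assumption, the 99% goal corresponds to about 87.6 hours of downtime per year, and the 98.0% target for Research Dedicated Storage in Section 2.3 corresponds to about 175.2 hours per year.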

3.1 Service Availability

Unless noted-

3.2 Service Requests and Incident Reporting

All Customer requests for service and service calls including incident reports should go through the CWRU Service Desk at 216.368.4657.

3.4 Escalation Procedures

During normal business hours, the Customer should contact the UTech Service Desk for all urgent and non-urgent incidents. If the incident does not receive a response within a reasonable amount of time, the Customer should contact RCCI at rcci-support@case.edu.

3.5 Access Controls and Security

User accounts and group assignments of files and folders must be managed through the RCCI AMARA (https://amara.case.edu) membership system. Customers should contact RCCI at rcci-support@case.edu with questions about complex group assignments or the security of the system.

3.6 Handling of Restricted Data (ePHI)

No clinical research data involving patient data, often referred to as electronic protected health information (ePHI), is allowed on RCCI research storage systems. Please contact UTech for more information regarding the Secure Research Environment and other ePHI-sanctioned systems.

3.7 Services and Fees

Refer to Section 2.0 of this document regarding service options and associated fees.

3.8 Exclusions

As with any service, RCCI cannot guarantee timeframes for the following situations:

4.0 Version History

Version 1.0 (10/06/2017): Mike Warfe, Roger Bielefeld, Hadrian Djohari, Erin Fogarty. Change: Initial draft of document with edits.

Version 1.1 (02/20/2018): Roger Bielefeld, Mike Warfe. Change: Clarification of purpose and scope to denote “Panasas” storage is outside document scope. Clarification on the research archival service to denote tapes must be purchased from UTech. Clarification on the research dedicated service to denote that only a standard set of servers can be purchased for use.

Version 1.2 (01/24/2020): Mike Warfe. Change: Updated storage fees and technology.

Version 1.3 (01/15/2022): Mike Warfe. Change: RDS service attributes changed to note that storage is only mounted on HPC head nodes.

Version 1.4 (02/08/2022): Mike Warfe. Change: Updated the research archival storage attributes to reflect changes to the service.