1 of 28

Cloud Forum – Secure Research Initiative – Lessons Learned

November 11, 2022

Barbara Schnell, Associate Director Secure Research Computing

2 of 28

  • Who we are
  • CMMC Journey
    • Before times
    • Campus Approach to CMMC
    • Lessons Learned
    • Automation

Topics

3 of 28

CU Research Computing Offerings

On-premises computing & storage

  • High Performance Computing
  • Large File Storage
  • On-prem cloud (OpenStack)

Training & User Support

  • Courses
  • Office hours
  • Help desk

Consulting

  • On-prem focused
  • Cloud focused

(ad hoc, emerging)

Cloud Enablement

  • AWS
  • Azure (emerging)
  • GCP (future)

Compliant Research Options

  • The Preserve (Azure Gov - CMMC)
  • Other tbd

4 of 28

  • Small dedicated sub-Team in Research Computing
    • 2 Cloud Engineers
    • 1 User Support / Cloud Analyst
    • 1 Security Operations Engineer
    • 1 Program Manager
  • Other supporting team members:
    • Cloud Security Architect
    • O365 Administrator
    • Compliance analyst

Secure Research Computing

5 of 28

Provide secure research storage, networking, computing and consulting services that address researchers’ needs for compliance with federal regulations while also enabling secure collaboration.

Mission

6 of 28

Governance

Export Control

Advisory Committee

Contracts & Grants

Security & Compliance

Financial Advisory

Research Computing

Researcher User Group

Emerging

7 of 28

CMMC JOURNEY

8 of 28

    • One System Security Plan per research project when required
    • Self-attesting
    • Limited Security Analyst staff for entire campus
    • New compliance requirements appearing and changing regularly

The Before Times

9 of 28

Campus Response to CMMC

Use Case / Need -> Solution

  • Governance needed -> Advisory Committee established by Security & Compliance Team
  • Need for scalability -> Microsoft Gov Solutions chosen
  • Need for compliant messaging & collaboration -> Office 365 GCC High applications
  • Need for compliant and scalable research computing resources -> Azure Government CU Boulder Landing Zone
  • Need for funding -> Start-up funding for staffing and tools to build environment via a project (temporary) funding mechanism
  • Need for ongoing service -> Cost recovery / ongoing funding model required upon project exit

Who benefits

  • Principal Investigators
  • Other Faculty, Researchers
  • Sponsors
  • Collaborators
  • Compliance teams
  • Research supporting staff

10 of 28

Timeline

Update inspired by Rick. Thanks Rick.

11 of 28

Lessons Learned

12 of 28

  • Hiring was difficult amid the pandemic and continues to be a challenge.
  • Role clarification is ongoing as we stand up new services.
  • “Swivel-chair” approach between environments presents workload, context-switching and responsiveness challenges.
  • Mixed results using vendors to help with staffing capacity challenges during hiring delays. Discovery and learning were positive results, but rework was required.

Staffing and Personnel Challenges

13 of 28

  • Sensitivity labeling end-user experience
    • Explaining the role of the sensitivity labels to end-users is challenging
    • Request for Outlook plugin for CUI came in from one tester (Titus)
  • Access to only the browser version of apps proved unsustainable -> implementation of virtual desktop infrastructure.
  • Acceptance of Microsoft platform
    • Mac and Linux users not accustomed to using Windows machines
    • Request for Linux as a desktop presents maintenance vs user acceptance challenges.

O365

14 of 28

  • Per compliance guidelines, system information is to be treated as highly sensitive, which means we need to find another ticketing tool to cover these scenarios:
    • User emails help system with specific system information
    • Technical teams need to share system information to implement change requests, share technical details on requests/incidents (assumes URLs in environment are system information)
  • Exploring purchase of separate FedRAMP ticketing system (MS Dynamics, ServiceNow FedRAMP) or continue to use Planner with manual processes.
  • Maintaining separate products for change/incident and repository management.

Need of Separate Ticketing System
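One way to picture the routing problem above is a small check that flags ticket text containing system information before it reaches a non-compliant tool. This is only a sketch: the patterns, the function names and both queue names are hypothetical, and it follows the slide's stated assumption that URLs count as system information. A real rule set would come from the campus compliance guidance.

```python
import re

# Hypothetical patterns for "system information". IPv4-looking strings are
# included as an extra assumption; the slide itself only mentions URLs.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_system_info(text: str) -> list[str]:
    """Return URL matches followed by IPv4-looking matches found in the text."""
    return URL_PATTERN.findall(text) + IPV4_PATTERN.findall(text)


def route_ticket(text: str) -> str:
    """Pick a queue (both names are placeholders) based on the ticket content."""
    return "compliant-queue" if find_system_info(text) else "general-queue"
```

A check like this would only reduce manual triage; it can miss system information that is written in free-form prose.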

15 of 28

Lack of Azure Gov Feature Parity

  • Several missing Sentinel connectors (e.g. storage accounts)
    • Custom script development and maintenance
  • Missing firewall / NSG FQDN tags
    • Creation and maintenance of custom service tags and rules
  • No support for Azure AD joined AVD hosts with FSLogix profiles
    • Prevents features like single sign-on across MS apps
  • Fewer instance sizes available and lower compute limits
    • Less flexibility for solutions and costs
  • Mix of gov-specific and public endpoints that need to be allowed and managed (see US gov GCC endpoints)
    • Creation and maintenance of additional firewall rules
  • Cannot use Intune to manage session hosts using Azure AD DS
    • Exploring other endpoint management options
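The custom rules for the mix of gov-specific and public endpoints could be kept as data and turned into rule definitions by a small script. The sketch below assumes a plain dictionary of endpoint groups; the `AllowRule` shape, the priority scheme and the example hostnames are hypothetical and do not correspond to any Azure API or to the team's actual rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    """A simplified, hypothetical outbound allow rule."""
    name: str
    priority: int
    fqdns: tuple[str, ...]
    port: int = 443


def build_rules(endpoint_groups: dict[str, list[str]],
                start_priority: int = 100, step: int = 10) -> list[AllowRule]:
    """Create one rule per non-empty group, with de-duplicated, sorted FQDNs."""
    rules: list[AllowRule] = []
    for group in sorted(endpoint_groups):
        cleaned = {f.strip().lower() for f in endpoint_groups[group] if f.strip()}
        if not cleaned:
            continue  # skip groups that contain no usable entries
        rules.append(AllowRule(
            name=f"allow-{group}",
            priority=start_priority + len(rules) * step,
            fqdns=tuple(sorted(cleaned)),
        ))
    return rules


# Placeholder hostnames only; real endpoint lists come from vendor documentation.
example = build_rules({
    "gov": ["login.example.invalid", "LOGIN.example.invalid"],
    "public": ["cdn.example.invalid"],
})
```

Keeping the endpoint list in one reviewed file means a change to the list can be reviewed and reapplied consistently instead of being edited rule by rule.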

16 of 28

    • Usability: Disabled clipboard between environments
    • Special application license costs (e.g. Acrobat)
    • Data ingress: experimented with multiple designs which have varying limitations, costs and risks. Implementing FileCloud now.
    • Development: most tools, even the Azure portal, expect access to GitHub.

Challenges of Building an “Offline” Environment

17 of 28

    • Highly secure -> higher cost environment – general cost recovery feasibility is still being explored.
    • Annual license cost recovery may be overly complex to manage (e.g. during periods between projects).
    • Varying application needs between researchers/projects drive separate AVD host pools, which can simplify cost recovery, but overall costs may be higher.

Cost Recovery
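The host pool trade-off can be made concrete with a little arithmetic. The sketch below splits the cost of one shared pool in proportion to usage hours, which is one possible allocation rule and not necessarily the one the campus will adopt. All figures and project names are made up for illustration.

```python
def allocate_shared_cost(pool_cost: float, usage_hours: dict[str, float]) -> dict[str, float]:
    """Split one pool's cost across projects in proportion to their usage hours."""
    total = sum(usage_hours.values())
    if total <= 0:
        raise ValueError("no usage to allocate against")
    return {project: round(pool_cost * hours / total, 2)
            for project, hours in usage_hours.items()}


# Hypothetical month: one shared pool costing 1200, used 30 h and 10 h.
shared = allocate_shared_cost(1200.0, {"project-a": 30, "project-b": 10})

# Hypothetical alternative: each project gets its own pool at 800 per month.
dedicated = {"project-a": 800.0, "project-b": 800.0}

shared_total = sum(shared.values())        # 1200.0, but needs usage metering
dedicated_total = sum(dedicated.values())  # 1600.0, but billing is one pool per project
```

With dedicated pools each project is simply billed for its own pool, while the shared pool is cheaper overall but depends on usage tracking and on deciding who pays during idle periods.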

18 of 28

    • Compliance includes Business Processes and entity accountability which makes serving multiple business entities (campuses, universities) challenging – but not impossible.
    • Avoid over-documenting – less can be more.
    • The mindset for self-attesting on compliance is very different from preparing for an external audit; it requires persistent education and training.
    • A crawl-walk-run approach, if you have time, can help teams get better at interpreting controls and applying them over time.
    • Invest in ongoing technical security skills training for system administration and engineering roles to improve conceptual understanding and continue to establish a security and compliance mindset.

Compliance Complexity

19 of 28

    • Purchasing process took a long time
    • Went through 2 unsuccessful attempts to get a .gov domain name

Other - Miscellaneous

20 of 28

Automation

21 of 28

  1. Made the decision to invest time in incorporating Infrastructure as Code (IaC) into our sprints:
    • Free up team time to focus on researcher requests
    • Automate deployment of infrastructure and controls/standards to reduce opportunity for error
  2. Team completed Terraform / IaC training together.
  3. Knowledge sharing, code reviews of existing IaC in AWS
  4. Enabling work
    • Set up GitHub server
    • Set up Docker for shared dev tools, libraries

Getting Started with Automation

22 of 28

  1. Decision to move infrastructure into code in discrete chunks vs all at once. Considered exporting everything built to-date using aztfy. May still use this tool for reviews.
  2. Moved an AVD host pool into Terraform.
  3. Knowledge transfer sessions for code reviews and lessons - “Terraform Tuesdays”.

First Steps
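When infrastructure is moved into code one chunk at a time, a common sanity check is that `terraform plan` reports no changes for the chunk just imported. The sketch below wraps that check in Python. It relies on the documented `-detailed-exitcode` behaviour of `terraform plan` (0 = no changes, 1 = error, 2 = changes present); the function names, status labels and the working directory in the usage comment are hypothetical.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
PLAN_STATUS = {0: "in-sync", 1: "error", 2: "drift"}


def classify_plan_exit(code: int) -> str:
    """Translate a plan exit code into a short status label."""
    return PLAN_STATUS.get(code, "unknown")


def check_chunk(workdir: str) -> str:
    """Run a plan in one Terraform working directory and classify the result.

    Assumes `terraform init` has already been run there and that the
    terraform binary is on PATH.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)


# Usage (hypothetical directory name):
#   status = check_chunk("avd-host-pool")
#   "in-sync" suggests the code matches what is deployed for that chunk.
```

Running this per chunk fits the incremental approach: a chunk is only treated as migrated once its plan comes back clean.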

23 of 28

Automation - Lessons Learned

Isolated Environment

  • Build and maintain isolated DevOps tools / artifacts repository.
  • Create container to house DevOps tools repository
  • “Swivel-chair” approach to outside resources for research and how-to resources
  • Import or hand-code reusable code from outside

Lack of features

  • No Azure DevOps as a service: map Jira tickets to Planner cards and design docs, or build Azure DevOps server
  • MS App Attach does not work with Azure AD DS: exploring RemoteApp and possibly an Ansible-type tool

24 of 28

  1. Considered developing in Azure commercial and moving to gov but lack of feature parity and required isolation introduced too much risk of rework.
  2. Building and maintaining our own artifacts repository (Terraform/Terratest resources, Terraform providers/APIs, module dependencies, Terratest Go packages, Python modules, VSCode extensions) will continue to be a challenge.
  3. Data ingress solution was not fully matured resulting in clunky workflow for importing large files such as VSCode extensions, aztfy (Azure Terrafy) executables, docker images, etc. Large file transfer tool implementation in progress (FileCloud).
  4. Possible latency issues when using US Gov Virginia (slow response times). Experimenting with US Gov Arizona.
  5. New team – learning IaC concepts and Terraform – requires time to bring the team along on approaches and to resist build-first-automate-later tendencies.

Other Automation - Lessons Learned
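Because artifacts such as providers, extensions and executables have to be carried into the isolated environment by hand, one simple safeguard is a checksum manifest that is built before transfer and verified afterwards. The sketch below uses only the Python standard library; the function names and the directory layout are assumptions, and it says nothing about how FileCloud or the team's repository actually work.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large artifacts do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path relative to root onto its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose digest differs."""
    problems = []
    for relative, expected in manifest.items():
        candidate = root / relative
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(relative)
    return problems
```

The manifest would be built on the outside staging directory, transferred alongside the files, and checked with `verify_manifest` inside the environment before anything is published to the internal repository.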

25 of 28

  1. Automate firewall rules, subscription creation, and other landing zone components.
  2. Explore and experiment with automation options for application config and deployment (Acrobat, MATLAB)
    • Balance platform-agnostic goal with effectiveness of using cloud-native tools
  3. Design and experiment with exporting code to repository for commercial platforms (AWS, Azure Commercial)

Automation – Next Steps

26 of 28

Develop -> Review -> Commit -> Build -> Publish -> Provision / Deploy -> Operate

Current State (tools used)

Dev tools, libraries

27 of 28

Develop -> Review -> Commit -> Build -> Publish -> Provision / Deploy -> Operate

Future State (possible toolsets to enhance automation)

Dev tools, libraries

Options named on the slide: Azure Image Builder, Azure Image Gallery; others TBD…

28 of 28

Q&A