1 of 43

Cloud @ U-M

Making Carrot Cake from Carrots

Alok Vimawala

Information & Technology Services

University of Michigan

avimawal@umich.edu

734-217-4595

Martin Sager

Information & Technology Services

University of Michigan

mjsager@umich.edu

2 of 43

Cloud @ U-M

Academic

Enterprise

3 of 43

Enterprise

4 of 43

Cloud computing is another tool in the toolbox (pragmatic)

Use the scale/expertise of enterprise to benefit research (benefit the mission)

5 of 43

Enterprise in the Cloud

  • Tiering approx. 2 PB of infrequently accessed files from on-prem Isilon to S3 to avoid capital expenditures
  • Set up an outpost in AWS (the Virtual Data Center) to facilitate moving enterprise services to the cloud
  • Moving/extending services that need to work even if campus is offline
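
As a rough illustration of the first bullet, the sketch below shows one way to identify "infrequently accessed" files by last-access time before tiering them. It is a minimal sketch, not the mechanism used between Isilon and S3: the function names, the idle threshold, and the reliance on `st_atime` (which is not updated on file systems mounted with access-time tracking disabled) are all assumptions.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional, Tuple

SECONDS_PER_DAY = 86_400


def find_cold_files(
    root: str, min_idle_days: int, now: Optional[float] = None
) -> Iterator[Tuple[Path, int]]:
    """Yield (path, size_in_bytes) for files not accessed in min_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_atime < cutoff:
                yield path, info.st_size


def summarize_cold_data(
    root: str, min_idle_days: int, now: Optional[float] = None
) -> Tuple[int, int]:
    """Return (file_count, total_bytes) of tiering candidates under root."""
    count = 0
    total = 0
    for _path, size in find_cold_files(root, min_idle_days, now):
        count += 1
        total += size
    return count, total
```

Calling `summarize_cold_data("/some/share", 90)` would report how much data is a candidate for tiering under a hypothetical 90-day idle threshold.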

6 of 43

Benefits to Research

  • Storage tiering resulted in Private Storage Pricing that benefits ALL AWS @ U-M users
  • Virtual Data Center is used to host the Container Service that is gaining popularity with researchers
  • Secured and managed machine images

7 of 43

Academic

8 of 43

Agenda

Enable fast, easy, and secure access to public cloud services

9 of 43

Vital statistics

  • 450 accounts/subscriptions/projects
  • Staff - 3 FTE plus management
  • Direct contract w/AWS & GCP
  • Using SoftChoice for Azure
  • BAA in place with all three providers

10 of 43

Number of accounts over time

11 of 43

How & What?

12 of 43

Onboarding

Grant Assistance

Guardrails

Enterprise Features

Training

Consulting

13 of 43

Onboarding

  • Central website to intake account creation and modification requests
  • Usually takes about 1 day to fulfill request
  • A place to ask questions about data types, security contact, billing information, etc.
  • Ensure that only approved data types are being used for AWS, AWS GovCloud, GCP, & Azure

14 of 43

Consulting

  • Work closely with faculty, students, and unit IT staff to understand needs
  • Investigate and propose products, solutions, and architectures to meet their needs
  • Engage with AWS, GCP, and Azure teams to bring in additional expertise
  • Free!!

15 of 43

Training

  • Weekly office hours
  • Ad-hoc and organized training events
  • Work with campus users to understand training needs
  • Work with cloud providers and campus units to determine best approaches and topics for training
  • Also provide ad-hoc training on an as-requested basis

16 of 43

Training

Date          | Host                                       | Participants | Topic
February 2018 | ITS                                        | 35           | GCP
May 2018      | ITS                                        | 45           | GCP
May 2018      | ITS                                        | 50           | AWS
June 2018     | ITS                                        | 26           | AWS
October 2018  | ITS                                        | 26           | GCP Architect
October 2018  | ITS / Webinar                              | 45           | AWS Storage
January 2019  | ITS                                        | 22           | GCP Architect
February 2019 | ITS / Webinar                              | 75           | AWS Re:Invent Recap
February 2019 | School of Information                      | 45           | AWS
March 2019    | College of LSA                             | 27           | AWS
April 2019    | Ross School of Business                    | 35           | AWS
August 2019   | Machine Learning for Healthcare Conference | 100+         | GCP Big Data
November 2019 | MiDAS Symposium                            | TBD          | AWS & GCP

17 of 43

Grant Assistance

  • Make campus community aware of grant and research credit programs
  • Assist faculty with:
      • applying for research credits
      • applying for teaching credits
      • cost estimates

18 of 43

Guardrails

  • Console access via U-M credentials plus MFA
  • Access to accounts tied to U-M MCommunity directory group membership
  • No access to root key
  • Account activity is logged to U-M Splunk instance
  • Optional - VPN connectivity to extend U-M RFC 1918 IP space into VPC
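
The optional VPN guardrail depends on the VPC using private address space that does not collide with ranges already in use. The sketch below is one way such a check could look, using only Python's standard `ipaddress` module. The three RFC 1918 blocks are standard; the function names and the example ranges in the usage note are hypothetical and are not U-M's actual allocations.

```python
import ipaddress
from typing import Iterable, List

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETWORKS = tuple(
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)


def is_rfc1918(cidr: str) -> bool:
    """Return True if the whole CIDR block lies inside RFC 1918 space."""
    net = ipaddress.ip_network(cidr)
    if net.version != 4:
        return False  # RFC 1918 only covers IPv4
    return any(net.subnet_of(block) for block in RFC1918_NETWORKS)


def conflicting_ranges(cidr: str, in_use: Iterable[str]) -> List[str]:
    """Return the already-allocated ranges that overlap the proposed CIDR."""
    net = ipaddress.ip_network(cidr)
    return [r for r in in_use if net.overlaps(ipaddress.ip_network(r))]
```

For example, `conflicting_ranges("10.20.0.0/16", ["10.20.4.0/24", "10.30.0.0/16"])` returns `["10.20.4.0/24"]`, signalling that the proposed VPC range would need to change before it is routed over the VPN.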

19 of 43

Guardrails

20 of 43

Guardrails

21 of 43

Enterprise Features

  • Consolidated billing - pay using shortcode - no POs!!
  • Egress waiver
  • Negotiated discounts (AWS & GCP)
  • Enterprise Agreement
  • Business Associate Agreement (BAA)
  • Private Storage Pricing (AWS)

22 of 43

Case Study # 1: Enabling the use of AWS for HIPAA data

Make it easier to use our services than to avoid them

23 of 43

  • Faculty and researchers are using and analyzing increasing amounts of data for their work
  • Need to process large amounts of data in memory
  • Need access to GPUs for machine learning

Local Workstation

Virtual Machine

Local HPC Cluster

???

24 of 43

Don’t we already have a BAA?

  • BAA is just the beginning of the journey
  • U-M still needs to ensure that its responsibilities under the shared responsibility model are met
  • Six distinct groups need to work together to ensure proper security and compliance

25 of 43

Customer is responsible for security in the cloud

customers = ["ITS", "HITS", "Researcher"]

Cloud Provider is responsible for security of the cloud

26 of 43

AWS HIPAA Account Creation (before)

Faculty requests AWS account for HIPAA data

Central IT - ??

Unit IT - ??

IT Security - ??

Compliance - ??

Faculty - WTF??!!??

Torches & Pitchforks

27 of 43

What did we do?

  • Limited the scope to “workstation in the cloud”
  • Developed a two-part solution to ensure compliance
      • Develop a set of base technologies and training that apply to everyone wanting to use the cloud for HIPAA data, i.e., the common 70%
      • The remaining 30% is specific to the use case

28 of 43

Technical Specifications

  • Only EC2, EBS, and S3 services are allowed
  • VPN connectivity to campus is required
  • Traffic to VPC is restricted to only campus networks
  • Only certain pre-approved AMIs are allowed
  • EC2 instances are automatically patched
  • EC2 instance creation via native interfaces
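
The restrictions above can be read as rules that a request either passes or fails. The sketch below expresses three of them (allowed services, pre-approved AMIs, campus-only traffic) as a plain Python check. It is an illustration under stated assumptions, not the enforcement mechanism used at U-M: the `LaunchRequest` type, the AMI identifiers, and the campus network range are hypothetical placeholders.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List

ALLOWED_SERVICES = {"ec2", "ebs", "s3"}
# Hypothetical placeholders; real AMI IDs and campus ranges would differ.
APPROVED_AMIS = {"ami-approved-example-1", "ami-approved-example-2"}
CAMPUS_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


@dataclass
class LaunchRequest:
    service: str
    ami_id: str = ""
    ingress_cidrs: List[str] = field(default_factory=list)


def violations(request: LaunchRequest) -> List[str]:
    """Return a list of rule violations; an empty list means compliant."""
    problems = []
    service = request.service.lower()
    if service not in ALLOWED_SERVICES:
        problems.append(f"service '{request.service}' is not allowed")
    if service == "ec2" and request.ami_id not in APPROVED_AMIS:
        problems.append(f"AMI '{request.ami_id}' is not pre-approved")
    for cidr in request.ingress_cidrs:
        net = ipaddress.ip_network(cidr)
        inside_campus = any(
            net.version == campus.version and net.subnet_of(campus)
            for campus in CAMPUS_NETWORKS
        )
        if not inside_campus:
            problems.append(f"ingress from '{cidr}' is outside campus networks")
    return problems
```

A request such as `LaunchRequest("ec2", "ami-approved-example-1", ["0.0.0.0/0"])` would be flagged only for its open ingress rule.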

29 of 43

AWS HIPAA Account Creation (after)

Faculty requests AWS account for HIPAA data

Unit IT works with faculty to assess request and make sure that the 30% is addressed

Account created

Opposite of Torches & Pitchforks

30 of 43

EC2 Instance Creation - Took approx. 1 day

Faculty requests workstation

Unit IT

  • create EC2 instance
  • assign it to correct security group

Unit IT

  • join Active Directory domain
  • create users & groups
  • install & configure Splunk, Sysmon, & Duo

Faculty able to use machine

31 of 43

EC2 Instance Creation - Approx. 5 minutes

Faculty requests workstation

Unit IT

  • create EC2 instance
  • assign it to correct security group

Unit IT

  • join Active Directory domain
  • create users & groups
  • install & configure Splunk, Sysmon, & Duo

Faculty able to use machine

Automagic happens here
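
The deck does not describe how the automation works, so the following is only a conceptual sketch of the idea: the manual steps become an ordered pipeline that runs without hand-offs. Every function name and placeholder value here is hypothetical, and each step merely records what a real implementation would do.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Step = Callable[[State], None]


def create_instance(state: State) -> None:
    state["instance_id"] = "i-EXAMPLE"  # placeholder, not a real instance ID


def assign_security_group(state: State) -> None:
    state["security_group"] = "sg-EXAMPLE"  # placeholder


def join_directory_domain(state: State) -> None:
    state["domain_joined"] = True


def create_users_and_groups(state: State) -> None:
    state["users_created"] = True


def install_agents(state: State) -> None:
    state["agents"] = ["Splunk", "Sysmon", "Duo"]


# The same steps listed on the slide, in the same order.
PIPELINE: List[Step] = [
    create_instance,
    assign_security_group,
    join_directory_domain,
    create_users_and_groups,
    install_agents,
]


def provision(requester: str, steps: List[Step] = PIPELINE) -> State:
    """Run every step in order; an exception in any step stops the run."""
    state: State = {"requester": requester, "completed": []}
    for step in steps:
        step(state)
        state["completed"].append(step.__name__)
    return state
```

Keeping the steps in a single list makes the order explicit and leaves a record of which steps completed if a run fails part-way.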

32 of 43

Case Study # 2: Machine Learning for Healthcare Conference

We are here to help

33 of 43

Machine Learning for Healthcare

  • Annual conference of clinicians, computer scientists, data scientists, statisticians, etc. focused on using machine learning to improve patient care
  • U-M hosted the 2019 conference in August
  • First ever Community Data Challenge

34 of 43

MLHC - Community Data Challenge using GCP

  • Create a central data repository in BigQuery with de-identified data from 2-3 different sources
  • Allow access to data in a way that minimizes data duplication and controls access to the data
  • Provide participants with Cloud Datalab instances to use for their work
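
One way to control access without duplicating data is to grant read-only access on the single central dataset to groups, rather than handing out copies. The sketch below builds entries in the shape of the `access` list on a BigQuery dataset resource (`role` plus `groupByEmail`). The deck does not say this is how access was actually granted, and the group addresses in the usage note are hypothetical.

```python
from typing import Dict, Iterable, List

AccessEntry = Dict[str, str]


def reader_entries(group_emails: Iterable[str]) -> List[AccessEntry]:
    """Build read-only access entries, one per group address."""
    return [{"role": "READER", "groupByEmail": email} for email in group_emails]


def merge_access(
    existing: List[AccessEntry], additions: List[AccessEntry]
) -> List[AccessEntry]:
    """Append new entries to the existing list, skipping exact duplicates."""
    merged = list(existing)
    for entry in additions:
        if entry not in merged:
            merged.append(entry)
    return merged
```

For instance, `merge_access(current, reader_entries(["team-01@example.edu"]))` yields an updated access list that could then be written back to the dataset; revoking a team's access is a matter of removing its entry, with no copies of the data to track down.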

35 of 43

36 of 43

MLHC - Community Data Challenge

  • Everything was automated using Terraform
  • Close collaboration between Google, ITS, conference organizers, corporate compliance, and others
  • ITS staff were on hand during the datathon to help with problems, answer questions, get participants started, …

37 of 43

MLHC - Community Data Challenge

  • The event was very successful
  • 153 participants
  • 11 teams
  • The Community Data Challenge is continuing throughout this year

38 of 43

Coming Soon(ish)

  • Support for regulated data
      • Other cloud providers
      • Other types of regulated data
  • “Approved” machine images
  • Budgeting and spend alerts
  • More automation
  • More consulting

39 of 43

Coming Soon(ish)

40 of 43

Q/A

Alok Vimawala

Information & Technology Services

University of Michigan

avimawal@umich.edu

734-217-4595

Martin Sager

Information & Technology Services

University of Michigan

mjsager@umich.edu

41 of 43

Consulting

“Chatted with <research group> today about their app. They were trying to use some networking features as part of lambda and thought NAT was causing problems. NAT was a red herring. The resources were actually launched in different VPCs. We moved the lambda function and changed some security groups. Easy fix!”

42 of 43

Consulting

“Talked with <faculty member> last week. Her research is in tech accessibility and had some issues connecting to her RDS database. The issue was a misconfiguration of her DB. She ended up putting her DB behind a NAT (good) but then could not connect from campus. We talked briefly about the architecture of the app, but next week I will meet with her lab to discuss it in depth.”

43 of 43

Consulting

  • University Library and U-M’s HathiTrust are looking for a different way to back up their data
  • They currently back up to an on-prem tape library
  • Assisting with solution engineering and analysis (storage cost, retrieval time, retrieval fees, etc.) to compare different options including cloud storage
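
The comparison described in the last bullet can be framed as simple arithmetic over storage cost, retrieval fees, and retrieval time. The sketch below is a minimal model of that analysis. The `BackupOption` type and both functions are hypothetical, and any prices fed into it are inputs supplied by the analyst, not figures from this deck or from any provider's price list.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BackupOption:
    name: str
    storage_per_tb_month: float  # cost to keep 1 TB for one month
    retrieval_fee_per_tb: float  # cost to restore 1 TB
    retrieval_hours: float       # typical time before data is readable


def annual_cost(
    option: BackupOption, stored_tb: float, restored_tb_per_year: float
) -> float:
    """Yearly cost = 12 months of storage plus expected retrieval fees."""
    storage = option.storage_per_tb_month * stored_tb * 12
    retrieval = option.retrieval_fee_per_tb * restored_tb_per_year
    return storage + retrieval


def rank_options(
    options: Iterable[BackupOption],
    stored_tb: float,
    restored_tb_per_year: float,
    max_retrieval_hours: float,
) -> List[str]:
    """Names of options meeting the retrieval-time limit, cheapest first."""
    eligible = [o for o in options if o.retrieval_hours <= max_retrieval_hours]
    eligible.sort(key=lambda o: annual_cost(o, stored_tb, restored_tb_per_year))
    return [o.name for o in eligible]
```

Separating the retrieval-time limit from the cost ranking keeps the two questions distinct: first which options are acceptable at all, then which of those is cheapest for the expected restore volume.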