1 of 75

Infrastructure Management

Google Private Cloud (GPC)

Lea Lonnberg-Hickling, UX Designer

Portfolio case study

2 of 75

Infrastructure Management

Role: Lead UX Designer

Timeline: Q2-2021

Launched: Q1-2022

Launch type: platform console

Sole and lead designer of Infrastructure Management, a new (at the time) platform for customers with data sovereignty needs, unlocking $XXB in revenue. Couples with Anthos Private Mode to provide customers with an end-to-end virtual private cloud solution.

Proprietary + Confidential

3 of 75

Project team

Lea

Lead UX Designer

Chad

Sr. UX Researcher

Aniket

Product Manager

Zewen

Software Engineer

4 of 75

Table of Contents

UX opportunity

User role

Designs

Launch

Appendix: design iterations

5-12

13-15

16-57

58-59

61-73

5 of 75

UX opportunity

GPC infrastructure management UX

6 of 75

Digital sovereignty

Context

Digital sovereignty refers to an organization’s ability to exercise autonomous control over data ownership, access, use and flow, and to exercise control over the infrastructure used.

  • Rapid increase in customer concern about digital sovereignty.
  • Increased scrutiny on how data is stored, accessed and impacted by cloud service providers.
  • New policies, in the EU especially, concerned with foreign cloud providers’ access to critical data.
  • Geopolitical landscape: embargoes, sanctions or other “black swan” events can happen with little notice and affect countries’ workloads.

7 of 75

Google Private Cloud offering

Google Private Cloud

Hardware

Software

Google’s answer to digital sovereignty concerns.

  • A separate, disconnected platform that can operate with no support from a US vendor for 1+ year (catering to survivability needs during “black swan” events).
  • Built on top of Anthos, which allows customers to run applications outside of GCP, while still leveraging the benefits of Cloud.
  • Upholds the Anthos promise: build once and deploy anywhere.

8 of 75

What is Google Private Cloud?

Google Private Cloud is a new software and hardware solution built on top of Anthos. The GPC platform serves customers with digital sovereignty requirements that limit their sharing of data with public cloud providers.

GPC enables customers to build, deploy and monitor applications in their isolated on-premises environments, disconnected from Google.

9 of 75

GPC target customers

Data sovereignty

A subset of customers’ workloads must meet sovereignty requirements; however, the majority of their workloads can be operated by Google Cloud directly or as part of their Anthos Hybrid Strategy.

$XXB blocked revenue

Operational sovereignty

Customers’ workloads need to be operated independently from GCP with no data connectivity.

$XXXM enterprise + $X.XB public sector blocked revenue

Partner operated cloud

Customers’ workloads are in regions without a Google datacenter and need local isolation due to geopolitical or latency requirements.

$XXXM

10 of 75

UX opportunity

No more abstraction

Google Cloud customers are abstracted away from their underlying infrastructure. Customers run their applications in Google data centers, without having to worry about managing physical hardware (servers, etc.).

Google-configured hardware running on-prem

Private Cloud customers, in contrast, are shipped hardware pre-configured and optimized by Google. The customer sets up their hardware, creating an isolated on-premises data center. (Customers can also choose to use a third party to manage their infrastructure.)

Disconnected mode

GPC runs in a disconnected mode (for GA), meaning no data is shared with or accessible by Google. In the future, hybrid mode will allow hybrid use of Google Cloud with Private Cloud. A break-the-glass scenario will also be designed, allowing Google to support select resources for a finite period of time to help customers troubleshoot critical issues.

Unlike Cloud customers, Private Cloud customers need a solution for infrastructure management

11 of 75

GPC information architecture: components & personae

Marketplace

Identity & access

Cluster Management

Config management

Service mesh

VM Management

VM instance management

VM network/storage Ops

Support

Platform observability

Infrastructure Management

HW infrastructure management

Platform Admin

Infrastructure Operator

All or Some Combination

Multi-tenancy

Application Management

Operate workloads

Health checks, roll outs

Application Operator

GPC UI components (MVP)

Personae

12 of 75

Infrastructure management

What

  • A software suite that manages the lifecycle of Private Cloud hardware systems (compute, storage, networking), OS and platform components
  • Focus scenario: disconnected + customer (or 3rd party) managed

Why

  • Need a machine/OS management solution that can operate in disconnected mode
  • Enterprise customers expect a single pane of glass to manage different infrastructure components in Private Cloud
  • Consistent across different hardware vendors
  • Integrated with Anthos software layer for streamlined operations

How

  • Delivered as a container-based service, deployed locally on an admin cluster worker node
  • Can be pre-installed during the manufacturing assembly process

13 of 75

User role

GPC infrastructure management UX

14 of 75

Infrastructure Operator

Infrastructure Operators manage and maintain hardware infrastructure, such as servers, networking switches, and storage appliances to support optimal operation of platforms and services running on-premises. Infrastructure Operators include systems administrators, networking administrators, and more.

May also be called: System Administrator, Platform Operator, Network Administrator, Storage Administrator

15 of 75

Infrastructure Operator

Responsible for

  • Administer hardware infrastructure, such as servers, networking and storage.
  • Set up and onboard tenants to the platform.
  • Provide support to tenants.
  • Monitor, maintain, and patch / update the platform.
  • Troubleshoot and restore services in case of issues.

Engagement model

The Infrastructure Operator may represent the customer or a third party the customer has outsourced their infrastructure management to.

16 of 75

Designs

GPC infrastructure management UX

17 of 75

UX process

Five 90-minute remote research sessions

Participants with hardware infrastructure systems admin experience

Evaluated infrastructure management UX

Contextual interviews and usability studies

Usability tested end-to-end infrastructure management UX

18 of 75

Monitor infrastructure

As an Infrastructure Operator, I need to monitor the status and details of my hardware components so that I can ensure my infrastructure is healthy and performant.

CUJ Tasks

  • Monitor the health and status of all the hardware components in my data center.
    • Racks
    • Servers
    • Networking switches
    • Storage components
  • Understand the relationship between the components so that I know where to send a technician to troubleshoot issues.
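The component relationships in these tasks can be sketched as a small data model. This is a minimal illustration only, assuming each rack holds servers, switches and storage nodes; the `location` field, the status values and the function name are hypothetical and not taken from the GPC console.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Component:
    name: str
    kind: str            # assumed values: "server", "switch", "storage-node"
    status: str = "ok"   # assumed values: "ok", "warning", "error"


@dataclass
class Rack:
    name: str
    location: str        # hypothetical field, e.g. a row label in the data center
    components: List[Component] = field(default_factory=list)


def locate_unhealthy(racks: List[Rack]) -> List[Tuple[str, str, str, str]]:
    """Return (rack, location, component, status) for every non-ok component."""
    return [
        (rack.name, rack.location, component.name, component.status)
        for rack in racks
        for component in rack.components
        if component.status != "ok"
    ]


# Illustrative inventory; names echo the deck, statuses are invented.
RACKS = [
    Rack("rack-1", "row A", [
        Component("admin-server-1", "server", "warning"),
        Component("mswitch-1", "switch"),
    ]),
    Rack("rack-2", "row B", [Component("s-node-1", "storage-node", "warning")]),
]
```

Calling `locate_unhealthy(RACKS)` lists each unhealthy component together with the rack it sits on, which is the information an operator would need before sending a technician.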

Infrastructure Operator

19 of 75

Dashboard and global alerts

Final designs

20 of 75

Infrastructure Operator dashboard

Participants commented favorably on monitoring. The layers of progressive disclosure between the main dashboard and resource details and Grafana tested successfully.

Participants expected this dashboard to be their home page (updated after UXR).

GPC supports role-based views, meaning the IA and resource views are custom for GPC’s 3 user roles: Platform Admins, Infrastructure Operators and App Operators.

21 of 75

Participants found the single-pane-of-glass, all-in-one solution unique and valuable. None of the participants had used software that manages servers, networking, and storage from a single system.

22 of 75

This is noiseless. It’s straight forward. Here’s what you’re looking at. Here’s your environment.

Sr. Systems Admin, PNC Bank

P2 (referencing the dashboard)

23 of 75

Global alerts

24 of 75

Notifications panel

The panel is accessible via the navigation bar.

In contrast to the dashboard, which shows an aggregate of status, the notifications panel displays single-resource level errors and warnings across the entire platform.

  1. “View” navigates the user to the resource.
  2. If applicable, “learn more” links the user to GCP docs.
  3. Users can filter the notifications to view a single type of resource, error, or filter by time period.

This pattern was leveraged from a release notes UX in another GCP product.
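The filtering behavior of the notifications panel can be sketched as follows. This is a hedged illustration, not the console's implementation: the `Notification` fields, the severity values and the sample messages are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Notification:
    resource: str        # e.g. "wl-server-1"
    resource_type: str   # assumed values: "server", "switch", "storage-node"
    severity: str        # assumed values: "error", "warning"
    message: str
    raised_at: datetime


def filter_notifications(
    items: List[Notification],
    resource: Optional[str] = None,
    resource_type: Optional[str] = None,
    severity: Optional[str] = None,
    since: Optional[datetime] = None,
) -> List[Notification]:
    """Keep notifications matching every filter that is set, newest first."""
    kept = [
        n for n in items
        if (resource is None or n.resource == resource)
        and (resource_type is None or n.resource_type == resource_type)
        and (severity is None or n.severity == severity)
        and (since is None or n.raised_at >= since)
    ]
    return sorted(kept, key=lambda n: n.raised_at, reverse=True)


# Illustrative data only; messages and timestamps are invented for the sketch.
NOTIFICATIONS = [
    Notification("wl-server-1", "server", "error", "Fan failure",
                 datetime(2021, 9, 1, 8, 0)),
    Notification("mswitch-1", "switch", "warning", "Port flapping",
                 datetime(2021, 9, 2, 9, 30)),
    Notification("wl-server-1", "server", "warning", "High temperature",
                 datetime(2021, 9, 3, 14, 15)),
]
```

Passing `resource="wl-server-1"` narrows the list to a single resource, while leaving every filter unset returns the platform-wide list.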

25 of 75

Resource detail entry point

Users can also activate the notifications panel from within a resource detail page.

  1. This server has 7 total notifications. The user selects the message bar to view them.

26 of 75

Notifications panel filtered

  1. The notifications panel is activated, filtering to display only wl-server-1’s issues.

27 of 75

Participants found global notifications very powerful and useful as a single spot to monitor events from anywhere in the system.

28 of 75

Racks

Final designs

29 of 75

Racks page

  1. An inventory of racks in the data center along with important status and identification data.
  2. Users can filter the table by status easily via the scorecard.
  3. The user can export the rack data into an easily shareable spreadsheet report they can share with management and team members (UXR finding).
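The scorecard filter and spreadsheet export described above could work along these lines. This is a rough sketch under stated assumptions: the column names, the status values and the sample inventory are hypothetical, and CSV stands in for whatever spreadsheet format the console actually produces.

```python
import csv
import io
from collections import Counter
from typing import Dict, List

# Illustrative inventory; column names are assumptions, not the console's schema.
RACKS: List[Dict[str, str]] = [
    {"name": "rack-1", "status": "warning", "servers": "8"},
    {"name": "rack-2", "status": "ok", "servers": "8"},
    {"name": "rack-3", "status": "ok", "servers": "6"},
]


def scorecard(rows: List[Dict[str, str]]) -> Counter:
    """Count racks per status, as a scorecard might summarize them."""
    return Counter(row["status"] for row in rows)


def filter_by_status(rows: List[Dict[str, str]], status: str) -> List[Dict[str, str]]:
    """Return only the rows with the selected status."""
    return [row for row in rows if row["status"] == status]


def export_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the (possibly filtered) table to CSV text for sharing."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "status", "servers"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `export_csv(filter_by_status(RACKS, "warning"))` produces a report containing only the racks that need attention.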

Participants commented favorably on monitoring. The layers of progressive disclosure between the main dashboard and resource details and Grafana tested successfully.

The user selects rack-1 to view more details about the warning status.

30 of 75

31 of 75

On the rack, on the side, there are backup power units with the ethernet cable connected... For example, if the [server’s] UPS is failing… this will be able to give you that status. Although less used, but definitely this is one of those things that you’re not going to use it every day but it’s useful when you need it.

IT Admin, Optimose (Healthcare)

P1 (referencing rack pages)

32 of 75

Rack details

The “details” tab gives users more information about their rack.

  1. Users can view additional tabs to see what servers, networking and storage components sit on the rack. (See next slide)

33 of 75

The relationship between hardware components

GPC rack architecture

34 of 75

Rack details: servers

The user can view all the servers on rack-1.

  1. Selecting the server will navigate the user to the server details page.
  2. Placeholder actions for future (post-GA) UX.

35 of 75

Rack details: networking

The user can view all the networking switches on rack-1.

  1. Selecting the switch will navigate the user to the switch details page.
  2. Placeholder actions for future (post-GA) UX.

36 of 75

Rack details: storage

The user can view all the storage nodes on rack-1.

  1. Selecting the node will navigate the user to the node details page.
  2. Placeholder actions for future (post-GA) UX.

37 of 75

Servers

Final designs

38 of 75

Servers page

  1. Each hardware component has its own page, as some Infrastructure Operators may manage servers exclusively, for example.
  2. An inventory of servers on all the racks along with important status and identification data.
  3. The user can export the server data into an easily shareable spreadsheet report they can share with management and team members (UXR finding).
  4. The user can customize the table columns.

Participants commented favorably on monitoring. The layers of progressive disclosure between the main dashboard and resource details and Grafana tested successfully.

The user selects admin-server-1 to view more details about the warning status.

39 of 75

Server details

  1. Message bar communicates the error occurring on admin-server-1.
  2. Links out to logging and monitoring dashboards in Grafana. This progressive disclosure flow of monitoring tested successfully (UXR finding).
  3. Detailed information about admin-server-1, helping users understand what needs replacing, fixing and reconfiguring.
  4. Actions will include editing or a lock icon for immutable fields.

40 of 75

Server operations

41 of 75

Server actions

  1. Once the user selects a server, the actions available to that resource become active.

The user selects to turn off the server (potentially to replace or fix it).

42 of 75

Interstitial dialog

Before the action is performed, the dialog communicates the impact of the action to the user.

Participants need to understand the scope and impact of shutting down a resource.

43 of 75

Toast: in-progress status

  1. Message communicating the server is in the process of shutting down.

44 of 75

Toast: complete status

  1. Message communicating the server is turned off.
  2. Status icon changed to turned off.

45 of 75

Networking

Final designs

46 of 75

Networking page

The user can view all the TOR switches and management switches across all the racks in their data center.

TOR switches: hardware that connects devices on a computer network, allowing resources to communicate with each other.

Management switches: devices that allow users to monitor their network and control traffic.

47 of 75

Switch details

  1. Links out to logging and monitoring dashboards in Grafana. This progressive disclosure flow of monitoring tested successfully. Button labels updated to provide better feed-forward for users (UXR finding).
  2. Detailed information about mswitch-1, helping users understand what needs replacing, fixing and reconfiguring.

The user selects to view the monitoring dashboard for mswitch-1.

48 of 75

Grafana dashboard

A new window is opened with Grafana dashboards for mswitch-1.

To provide users with rich data on their infrastructure, GPC relies on open-source tooling to enhance the UX while still meeting customers’ digital sovereignty requirements.

Post-GA, product and UX will decide what components to bring into the GPC platform UI.

Participants commented favorably on monitoring. The layers of progressive disclosure between the main dashboard and resource details and Grafana tested successfully.

Participants liked the Grafana dashboards and wanted even more granular data available in Grafana.

49 of 75

I think it’s good that we have monitoring dashboards here. That’s really cool. That is very helpful. Incoming traffic and outgoing traffic, good. So you can look at the interfaces and see if it’s been up or down. Oh, it’s been up 105 days. That’s good. Good switch!

Systems Admin, University of Washington

P3 (referencing the switch monitoring dashboard in Grafana)

50 of 75

Storage

Final designs

51 of 75

Storage page

Storage clusters are conceptual groupings of storage components.

Most customers will have one, or at most two, storage clusters.

Users can view a roll-up of status across the storage clusters’ nodes, disks and capacity.

The user selects to view the components within s-cluster-2.

52 of 75

Storage cluster details

Storage clusters are conceptual groupings of storage components.

  1. Users can tab to view storage nodes and disk pools within this storage cluster.
  2. Improved storage concepts’ definitions after study.

53 of 75

Storage cluster details: storage nodes

Storage nodes run backup software ensuring the safe-keeping of stored data. Storage nodes act in pairs and are physically connected to their partner node.

  1. Scorecard rolls up the status of all the storage nodes within the storage cluster.
  2. Users can navigate directly to the storage node and rack details pages.
  3. Placeholder action for post-GA.
  4. Easy access to the storage nodes’ monitoring and logging dashboards in Grafana.

The user selects s-node-1 to see why a warning is firing...

54 of 75

If I was the admin I would walk down to the basement… go to this rack… and then yank out this specific [storage node] pair and see this hard drive.

IT Admin, Fiserv (Financial)

P4 (referencing storage pages)

55 of 75

Storage node details

Storage nodes run backup software ensuring the safe-keeping of stored data. Storage nodes act in pairs and are physically connected to their partner node.

  1. Detailed information about s-node-1, helping users understand what needs replacing, fixing and reconfiguring.
  2. Links out to logging and monitoring dashboards in Grafana. This progressive disclosure flow of monitoring tested successfully.
  3. Placeholder actions for post-GA.
  4. Actions include editing or a lock icon for immutable fields.

The user navigates back to the storage cluster details page.

56 of 75

Storage cluster details: disk pools

Disks are storage mechanisms where data is recorded and stored.

Disk pools are groups of disks across storage nodes.

  1. The user can view which primary storage node the disk is tied to.

The user selects s-group-1 to see why a warning is firing.

57 of 75

Disk pool details

Disks are storage mechanisms where data is recorded and stored.

  1. The user is navigated to the disk pool details page where they can see which storage nodes and clusters the disk pool is serving.
  2. In the disks tab, users will find all the disks within the disk pool.

58 of 75

I did really like this system… the UI and everything is pretty clean and understandable.

Network Engineer, American Chemical Society

P5 (during session wrap-up)

59 of 75

Launch

GPC infrastructure management UX

60 of 75

Launch

Private preview with ST Engineering Q3-2021

Singaporean tech company with aerospace, defense, marine and public security divisions. Private preview September 2021.

Google Private Cloud GA launched in Q1-22

Infrastructure management along with GPC’s other MVP features launched January 2022.

Google Private Cloud unlocks $XXB in customer opportunities

Unblocked discussions with confidential customer prospect.

Google Private Cloud Launch Impact

61 of 75

Thank you

62 of 75

Appendix: design iterations

GPC infrastructure management UX

63 of 75

Early concepts

Appendix

64 of 75

Early concept 1

65 of 75

Early concept 2

66 of 75

Early concept 3

67 of 75

Early concept 3

68 of 75

Early concept 3: alerts

69 of 75

Component details template 1

70 of 75

Component details template 2

Post-GA, include charts for richer component-level data.

71 of 75

Storage designs tested in study

Appendix

72 of 75

Storage nodes

  1. Storage cluster concept represented in table

73 of 75

Storage cluster details

74 of 75

Disk pools

75 of 75