1 of 26

Skyway: A Seamless Platform that Democratizes Cloud Computing for Researchers

Trung Nguyen and Hakizumwami Birali Runesha

Research Computing Center, University of Chicago

May 19, 2026

Cloud Forum 2026

University of Wisconsin, Madison, May 19-21, 2026

2 of 26

Our Team

Birali Runesha

Varun Sharma

Himi Yadav

Ross Hyman

Trung Nguyen

Yuxing Peng

Virender Kumar

3 of 26

What are the urgent needs?

  • Burst workloads to commercial clouds.
  • Do time-sensitive and highly demanding work.
  • Hardware and specialized services only available on the cloud.
  • Very minimal (or no) learning curve for HPC users.
  • Easy to manage group accounts.

4 of 26

Cloud access for all should be easy

  • Seamless transition between HPC and cloud environments
  • Support cloud credit providers
  • Avoid cloud vendor lock-in
  • Monitor and control usage at user level
  • Recommend optimal instances for specified workloads

5 of 26

What is Skyway?

► HPC: on-premises or cloud?

On-premises solutions

  • provide a high level of control
  • are more cost-effective under high utilization

Cloud computing solutions

  • are highly elastic
  • handle variable demands of various workloads

A hybrid infrastructure that can extend on-premises resources to the cloud can combine the advantages of both.

Skyway

  • is a hybrid solution that lets HPC users submit workloads to either on-premises or cloud resources from a unified environment.
  • bursts jobs to different cloud platforms seamlessly, with no learning curve or added operating effort.
  • offers a billing module that prevents exceeding a budget allocation and manages a research group's expenditure.
6 of 26

Running jobs at the UChicago RCC

[Diagram labels: RCC Users; University Shared (compute nodes, storage); Faculty CPP (compute nodes, storage)]

7 of 26

Running jobs at the UChicago RCC

[Diagram labels: RCC Users; University Shared (compute nodes, storage); Faculty CPP (compute nodes, storage); Commercial Clouds: AWS, GCP, OCI, Azure (compute nodes, storage)]

8 of 26

[Architecture diagram labels: Users; HPC login nodes; skyway module (Slurm wrapper, Cloud API wrapper); On premises (Midway): storage, compute nodes; On the cloud (AWS, GCP, Azure, Oracle): cloud storage, cloud instances; paths shown: /software/[selected], /project/pi-account]

9 of 26

Skyway is a flexible tool that offers

  • HPC end users
    • easy ways to run heavy workflows on the cloud
    • easy ways to interact with multiple cloud resources
  • Cloud account holders
    • easy ways to monitor and manage the budget per user
    • easy ways to manage multiple cloud accounts
  • HPC admins
    • easy ways to add cloud accounts and cloud vendors
    • easy ways to install, maintain, troubleshoot and upgrade

10 of 26

11 of 26

Success stories

  • UChicago Master of Science in Applied Data Science program with Google Cloud Platform (GCP)
  • Generative AI workflows enabled with Amazon Web Services (AWS) Bedrock
  • Quantum collaboration between UChicago and the University of Tokyo through the GCP Gift 2025-2026
  • Oracle Cloud - Skyway integration

12 of 26

User Support and Engagement

  • For end users:
    • To use a cloud account, raise the allocated budget, or report issues, email our Help Desk at help@rcc.uchicago.edu
  • For cloud account holders:
    • connecting existing cloud accounts with Skyway
    • managing credits and allocation for individual users
    • usage reporting
  • For developers and contributors
    • creating issues and submitting pull requests

13 of 26

Main components

Command Line Interface

(accessible via SSH, ThinLinc, Open OnDemand)

Dashboard

(accessible via ThinLinc and Open OnDemand)

Skyway API

14 of 26

15 of 26

Skyway CLI: Slurm-inspired commands

# Load the Skyway module first

module load skyway

# List all the node/VM types

skyway_nodetypes --account=liangjiang-gcp

# Submit an interactive job (--account and -A are equivalent); wall time is required

skyway_interactive -A liangjiang-gcp --constraint=c1 --time=02:00:00

# Submit a batch job

skyway_batch job_script.sh

# Transfer data from Midway to the cloud node, and from the cloud node to Midway

skyway_transfer -A liangjiang-gcp -J your-run /path/to/input-data

skyway_transfer -A liangjiang-gcp --from-cloud --cloud-path ~/output-data /home/$USER

# List all the running/stopped VMs of a cloud account

skyway_list --account=liangjiang-gcp

# Cancel/terminate a job

skyway_cancel --account=liangjiang-gcp your-run
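The commands on this slide can be chained into one small script. The sketch below is illustrative only: it reuses the commands exactly as they appear above, and the ordering, the run helper, the cloud_batch_run function, and the DRY_RUN switch are assumptions added here, not part of Skyway. In dry-run mode (the default in this sketch) each command is printed instead of executed.

```shell
#!/bin/sh
# Sketch only: chains the Skyway commands from the slide for one batch run.
# ACCOUNT and JOB_NAME are the example values used on the slide.
ACCOUNT="liangjiang-gcp"
JOB_NAME="your-run"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print commands, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed ordering: inspect node types, submit, check, fetch results, clean up.
cloud_batch_run() {
    run skyway_nodetypes --account="$ACCOUNT"
    run skyway_batch job_script.sh
    run skyway_list --account="$ACCOUNT"
    # The cloud path is passed literally so that ~ refers to the VM, not Midway.
    run skyway_transfer -A "$ACCOUNT" --from-cloud --cloud-path "~/output-data" "/home/${USER:-your-cnetid}"
    run skyway_cancel --account="$ACCOUNT" "$JOB_NAME"
}

cloud_batch_run
```

Setting DRY_RUN=0 after module load skyway would execute the same sequence on a login node; whether a job is ready for download at that point depends on the job itself.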

16 of 26

Recommend suitable instances for a workload

Skyway Advisor

skyway_advisor job_script.sh

Job submitted with cloud account ndtrung-oci
Cloud vendor: oci
Requested resource:
+ CPU cores = 2
+ Memory = 4 GB
+ GPUs = 1

Suggested node types for job_script.sh:

Instance Type    Cores  Mem (GB)  GPUs  GPU type     Per-hour Cost, $
---------------  -----  --------  ----  -----------  ----------------
VM.GPU.A10.1     2      8         1     nvidia-a10   2
VM.GPU.H100.8    2      8         8     nvidia-h100  10
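To make the selection concrete, here is a minimal sketch of how one might pick the cheapest suggested instance that satisfies a request and estimate its cost for a given wall time. It does not call Skyway; the two rows are copied from the advisor output on this slide, and the function names advisor_rows and cheapest_fit are hypothetical. How Skyway Advisor actually ranks instances is not stated in the slides.

```shell
#!/bin/sh
# Advisor rows copied from the slide:
# instance type, cores, memory (GB), GPUs, GPU type, per-hour cost ($).
advisor_rows() {
    cat <<'EOF'
VM.GPU.A10.1 2 8 1 nvidia-a10 2
VM.GPU.H100.8 2 8 8 nvidia-h100 10
EOF
}

# Print the cheapest instance that meets the request, followed by the
# estimated cost (hourly rate x hours). Prints nothing if no row fits.
# Usage: cheapest_fit CORES MEM_GB GPUS HOURS
cheapest_fit() {
    advisor_rows | awk -v c="$1" -v m="$2" -v g="$3" -v h="$4" '
        $2 >= c && $3 >= m && $4 >= g && (best == "" || $6 < rate) {
            best = $1; rate = $6
        }
        END { if (best != "") printf "%s %d\n", best, rate * h }'
}

# Request from the slide (2 cores, 4 GB, 1 GPU) with a 5-hour wall time.
cheapest_fit 2 4 1 5
```

For the request on the slide and a 5-hour wall time, this prints VM.GPU.A10.1 with an estimated cost of 10 dollars (2 dollars per hour for 5 hours).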

17 of 26

Usage history and job accounting

skyway_usage -A liangjiang-gcp -u ndtrung --byjob

18 of 26

Demo: Working with AWS EC2 instances from login node

19 of 26

Essential operations with Skyway CLI

Command             Description
------------------  -----------
skyway_info         List the useful commands
skyway_nodetypes    List all types of cloud nodes, aka virtual machines (VMs)
skyway_usage        Show the usage
skyway_alloc        Provision a cloud node
skyway_list         List all the running and stopped nodes
skyway_transfer     Transfer data from/to running nodes
skyway_connect      Connect (via SSH) to a cloud node
skyway_stop         Stop an instance (keep the data). Stopped instances are not charged because the CPUs, RAM and GPUs are released to the pool.
skyway_restart      Restart a stopped instance: allocate a new VM with the persistent data
skyway_cancel       Cancel/terminate a running node; all data on the VM will be erased
skyway_interactive  Provision and then connect to a cloud node
skyway_batch        Submit a batch script to a cloud node. Terminated jobs are stopped VMs that can be restarted.

20 of 26

Minimal changes to a Slurm-like job script for submitting to the cloud from HPC login nodes

[your-cnetid@midway3-login4 ~]$ cat job_script.sh

#!/bin/sh

#SBATCH --job-name=my-qc-run1

#SBATCH --account=liangjiang-gcp

#SBATCH --nodes=1

#SBATCH --time=05:00:00

#SBATCH --constraint=g1

# using skyway command to transfer data to the VM

skyway_transfer training.py input.data --cloud-path /tmp/project/

# inside the VM

source activate scicomp

python calculate.py > ~/output.txt

skyway_batch job_script.sh
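As an illustration of how small the change to an existing script can be, the sketch below rewrites the header of an on-premises Slurm script so that it matches the cloud script above. It is a sketch under assumptions: the on-premises values (account pi-account, partition gpu) and the helper name to_cloud_script are hypothetical, and mapping a partition line to a --constraint line is a guess at a typical edit rather than a documented Skyway rule.

```shell
#!/bin/sh
# Rewrite the #SBATCH header of a script read from stdin.
# Usage: to_cloud_script CLOUD_ACCOUNT NODE_TYPE < onprem.sh > cloud.sh
to_cloud_script() {
    sed -e "s/^#SBATCH --account=.*/#SBATCH --account=$1/" \
        -e "s/^#SBATCH --partition=.*/#SBATCH --constraint=$2/"
}

# Hypothetical on-premises header; the job name and wall time match the slide.
cat <<'EOF' | to_cloud_script liangjiang-gcp g1
#!/bin/sh
#SBATCH --job-name=my-qc-run1
#SBATCH --account=pi-account
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --time=05:00:00
EOF
```

Only the account and the node-type selection change; the remaining directives pass through untouched.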

21 of 26

Typical workflow #1: Interactive jobs

  1. Log in to your on-prem HPC login node
  2. Load the skyway module
    • module load skyway
  3. Request an interactive node
    • skyway_interactive
  4. Once on the VM, upload/download data, install software packages in the persistent storage /tmp/gcs, and run the calculations
  5. Stop the VM (e.g., for a break)
    • skyway_stop
  6. Restart the VM to resume the work with persistent data
    • skyway_restart
  7. Terminate the VM once done
    • skyway_cancel

22 of 26

Typical workflow #2: Batch jobs

  1. Log in to your on-prem HPC login node
  2. Load the skyway module
    • module load skyway
  3. Prepare a job script from an existing job script
  4. Submit the job script
    • skyway_batch
  5. List the running and stopped VMs
    • Running VMs: SSH into the VM with skyway_connect
    • Stopped VMs: restart the VM with skyway_restart to access the data and download it to Midway storage
  6. Cancel the job
    • skyway_cancel

23 of 26

Direct access to the instances made easy

  • SSH connection
  • Jupyter Notebook
  • VS Code

24 of 26

Launch & connect to the Jupyter server on the VM

  1. Log in to a Midway3 login node and run module load skyway
  2. Create a VM with an interactive session with skyway_interactive
    • skyway_interactive -A liangjiang -c c4 -t 02:00:00 -J my-run01
  3. Once the VM is ready, its public IP is available and an SSH tunnel is set up with a port number, e.g., 21471
  4. The prompt is now on the VM. Activate the Python environment scicomp under $HOME
    • source scicomp/bin/activate
  5. Launch Jupyter Notebook with the port number
    • jupyter notebook --no-browser --ip=127.0.0.1 --port 21471
  6. The Jupyter server then starts on port 21471 on the VM and returns a URL, something like http://127.0.0.1:21471/tree?token=b1ee21e419bd59dede01ab4bda37499597ea7e0b99a968f
  7. On another SSH session on the Midway3 login node, forward a port from the login node to the VM (using the private key)
    • ssh -i ~/.my_gcp_ssh_key.pem -L 21471:localhost:21471 [your-user-name]@[vm_public_ip]
  8. On your Mac laptop, forward a port on your machine to the login node (using 2FA)
    • ssh -N -f -L 21471:localhost:21471 [your-user-name]@midway3.rcc.uchicago.edu
  9. Finally, on your Mac, copy the URL above to your web browser address bar to connect to the Jupyter session on the VM.
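The two port forwards in steps 7 and 8 must use the same port number, which is easy to get wrong when typing them by hand. The sketch below only prints the two ssh commands from a single set of inputs; nothing is executed. The function names and the example address 203.0.113.10 (a documentation-only IP) are hypothetical, and the key path and login host are the ones shown in the steps above.

```shell
#!/bin/sh
# Succeeds only for an all-digit, non-privileged port number.
valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]
}

# Step 7, run on the Midway3 login node: login node -> VM.
# Usage: login_to_vm_cmd PORT USER VM_PUBLIC_IP
login_to_vm_cmd() {
    echo "ssh -i ~/.my_gcp_ssh_key.pem -L $1:localhost:$1 $2@$3"
}

# Step 8, run on the laptop: laptop -> Midway3 login node.
# Usage: laptop_to_login_cmd PORT USER
laptop_to_login_cmd() {
    echo "ssh -N -f -L $1:localhost:$1 $2@midway3.rcc.uchicago.edu"
}

PORT=21471
if valid_port "$PORT"; then
    login_to_vm_cmd "$PORT" your-cnetid 203.0.113.10
    laptop_to_login_cmd "$PORT" your-cnetid
fi
```

With both forwards in place, the URL printed by Jupyter on the VM resolves from the laptop's browser because each hop maps the same local port to the next machine.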

25 of 26

Connect to the VM with VS Code

  • Once a VM is up and running, its public IP is available.
  • Install the Remote SSH Extension for VS Code
  • Connect to the VM with its public IP, your CNetID, and the private key stored in your home folder on the Midway3 login node (~/.my_gcp_ssh_key.pem)
    • ssh -i ~/.my_gcp_ssh_key.pem [your-user-name]@[vm_public_ip]
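The Remote SSH extension reads host entries from an OpenSSH config file, so the connection details above can be stored as one entry. The sketch below only prints such an entry; the helper name vm_ssh_config, the alias skyway-vm, and the example address 203.0.113.10 are hypothetical. It also assumes the private key is readable at the given path on the machine where the SSH client runs, which the slide does not specify.

```shell
#!/bin/sh
# Print an OpenSSH client config entry for the VM.
# Usage: vm_ssh_config ALIAS VM_PUBLIC_IP USER KEY_PATH
vm_ssh_config() {
    printf 'Host %s\n    HostName %s\n    User %s\n    IdentityFile %s\n' \
        "$1" "$2" "$3" "$4"
}

# The tilde is quoted so it is written literally; ssh expands it when reading the file.
vm_ssh_config skyway-vm 203.0.113.10 your-cnetid "~/.my_gcp_ssh_key.pem"
```

Appending the printed entry to ~/.ssh/config would let both ssh skyway-vm and the extension's host picker use the same settings. The entry needs updating whenever a new VM receives a different public IP.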

26 of 26

Summary

Skyway offers a flat platform to get access to cloud resources:

  • Easy to learn, easy to use
  • Supports HPC/AI workloads
  • Easy to customize and add new features
  • Easy to monitor and manage
  • Easy to incorporate AI/ML features
